Abstract

The shift towards an energy Grid dominated by prosumers (consumers and producers of energy) will 
inevitably have repercussions on the distribution infrastructure. Today it is a hierarchical one designed 
to deliver energy from large scale facilities to end-users. Tomorrow it will be a capillary infrastructure at 
the Medium and Low Voltage levels that will support local energy trading among prosumers. In [74], we 
analyzed the Dutch Power Grid and made an initial analysis of the economic impact topological properties 
have on decentralized energy trading. In this paper, we go one step further and investigate how different 
network topologies and growth models facilitate the emergence of a decentralized market. In particular, 
we show how the connectivity plays an important role in improving the properties of reliability and path- 
cost reduction. From the economic point of view, we estimate how the topological evolutions facilitate 
local electricity distribution, taking into account the main cost ingredient required for increasing network 
connectivity, i.e., the price of cabling. 

Keywords: Power Grid, Decentralized energy trading, Complex Network Analysis 

1 Introduction 

The way energy is produced and distributed is changing due to the combined effects of 
technological advancements and the introduction of new policies. In recent decades a clear trend has swept 
the energy sector: unbundling, that is, the process of dismantling monopolistic and oligopolistic energy 
systems by allowing a greater number of parties to operate in a given role of the energy sector and market. The 
goal of unbundling is to reduce costs for end-users and to provide better services through competition 
(e.g., [2H1 [H]). At the same time, from the technological perspective, new energy generation facilities (mainly 
based on renewable sources) are becoming more and more accessible. These are increasingly convenient and 
available at both the industrial and the residential scale [SH [62]. The term Smart Grid, which does not yet 
have a unique agreed definition [IBJ 69J, is sometimes used to define the new scenario of a Grid with a high 
degree of delocalization in the production and exchange of energy. The new actors, who are both producers and 
consumers of energy, also known as prosumers, are increasing in number and will most likely demand a market 
with total freedom for energy trading [SS]. In this coming scenario, the main role of the High Voltage Grid may 
change, while the Distribution Grid (i.e., the Medium Voltage and Low Voltage end of the Power Grid) becomes 
more and more important and requires a major update. In fact, the energy interactions between prosumers 
will increase and most likely occur at a rather local level, therefore involving the Low and Medium Voltage Grids. 
This evolution of the energy sector will inevitably call for an upgrade of the enabling distribution infrastructure 
so as to facilitate local energy exchanges: an infrastructure more comparable to a "peer-to-peer" system on the 
Internet than to the current strictly hierarchical system. But how will the infrastructure evolve to enable 
and follow this trend? 

In [HQ we laid the foundation for a statistical study of the Medium and Low Voltage Grid with the aim of 
identifying salient topological properties of the Power Grid that affect decentralized energy exchange. We based 
that study on real samples of the Dutch Grid and provided an initial economic analysis of the possible barriers 
to delocalized trades. In this follow-up paper, we go one step further and consider growth models for network 
topologies, providing an analysis of which models best suit the purpose of local energy exchange. The tool for 
our study is Complex Network Analysis (CNA) [6]. In particular, in the present case we use CNA as a synthesis 
tool: we generate networks using topological models taken from the literature on the evolution of technological, 
infrastructural and social networks. 

In order to evaluate the adequacy of the generated networks, we develop a set of metrics that capture the 
various aspects that networks suited for small-scale energy exchange need to satisfy. It is then quite straightforward 
to compare the results of the synthetic models with the real samples analyzed in [74] and on that ground propose 
network models that best suit a prosumer-based local energy exchange. Finally, this makes it possible to evaluate 
quantitatively how the improvement in the topology directly influences electricity prices. 

We remark on the novelty of this proposal with respect to previous CNA studies of the Power Grid. In fact, CNA 
has been used only on the High Voltage networks to get information on resilience to failures, and the Medium 
and Low Voltage Grids have been mostly ignored. Another novelty is the use of Complex Network Analysis 
not as a tool for pure analysis of the existing infrastructure, but as a design tool for a new one. Using 
Graph Theory in the design of distribution systems is not completely new: several studies have incorporated 
Graph Theory elements in operations research techniques for Grid planning [29, 90], but never, to the best of our 
knowledge, has Graph Theory been combined with global statistical measures to design the Grid. In addition, 
we ground the design methods in investments by taking into account the costs of Grid cabling based on the types of 
cables typically used in real Distribution Networks (i.e., Northern Netherlands Medium and Low Voltage network 
samples). In summary, the paper proposes which topologies, according to CNA-based metrics, are best suited in 
terms of performance and reliability of the infrastructure for a local energy exchange, gives an estimation of the 
cabling cost for the realization of such topologies, and assesses the advantages, from the electricity distribution point 
of view, of the proposed topologies compared to the actual ones. 

The paper is organized as follows. We open by analyzing the motivations for a new energy landscape and 
the required changes to the current Grid in Section 2. The background of Graph Theory necessary for the 
present study is presented in Section 3. Section 4 describes the main properties of the graph models, while the 
metrics exploited to compare the properties of the various generated graphs are described in Section 5. The 
analysis and discussion of the results is presented in Section 6. An overall discussion and illustration taking 
into account benefits and costs of the evolution of topologies is given in Section 7. Section 8 reviews the 
main approaches to Electrical Grid and System design and evolution, while Section 9 provides a conclusion of 
the paper. A series of appendixes is included to provide extended coverage of topics related to the core of the 
paper. In particular, Appendix A describes statistical properties of power cables' price and resistance; Appendix B 
provides an overview of the relationship between network topology and electricity price; the appendix section 
concludes with Appendix C, which describes an engineering process based on Complex Network Analysis that 
can guide Grid and energy operators to shape their networks for the new local energy exchange paradigm. 

2 The Need for Evolution of the Grid 

In the XIX century electrical energy generation was considered a natural monopoly. The cheapest way to produce 
electricity was to generate it in big power plants and then transmit it across a country through a pervasive network of cables 
operated by a monopolistic state-owned company. The situation has changed and now more and more companies 
are present in the energy business, from energy production, to energy transmission and distribution, to retail and 
services provided to the end-user. To enable and accelerate this process, governments in the western world have 
promoted policies to open the electricity business and facilitate competition, with the final aim to both modernize 
the energy sector and provide a more convenient service for the end-user. A further step on this path of enabling 
everybody to be a producer of energy is the possibility (sometimes incentivized by government policies) to 
install small-scale energy generation units such as photovoltaic panels, small wind turbines and micro combined 
heat-and-power systems (micro-CHP), which are now all widely available and affordable for the end-user market. 
Such a small-scale approach is beneficial to the electricity system in many ways: from reduced losses, since source 
and load are closer, to system modularity, to smaller investments compared to large-scale energy solutions [62]. 
Local generation based on renewables is a boost for the transition towards a renewable-based energy supply. In 
fact, end-users generate their own energy and the additional supply is likely to be provided by other end-users in 
the neighborhood that have an energy surplus generated by their renewable-based generating equipment. In such a 
context, with many small-scale producers and still without an efficient and cheap energy storage technology, a local 
energy exchange at the neighborhood or municipal level between end-users is foreseeable and desirable. The increased 
performance of micro-grids in terms of reduced losses and power quality has been successfully tested [571 179j, 
but little attention has been devoted to the network topology of these types of Grids. 

In the evolution of the current electricity system to the Internet of Energy [3TJ [58], the end-users are producers 
in addition to the consumer role they have always had. Energy is then something that the user no longer 
needs to negotiate through yearly or lifetime contracts, but that can be traded between prosumers and consumers 
on a fully electronic and automated market. These energy exchanges are likely to be local, inside a neighborhood, 
a village or a city where users that have small-scale renewable-based producing facilities can sell their energy 
surplus. This represents a "win-win" solution: first for the environment, since more energy is generated 
from renewable sources, and second for the prosumer, who sells his surplus energy on the market and obtains some 
profit from it. This latter aspect helps accelerate the return on the investment made in purchasing the generating 
equipment. There is also a benefit for the end-user, who has more flexibility in choosing his energy provider and takes 
advantage of the cheaper tariffs of the prosumers. The traditional energy providers and distributors still play 
important roles even in this paradigm: the former provide traditional supply where or when prosumers are not 
available, while the latter have an even more critical role in monitoring and providing a Grid at Medium and Low Voltage 
that is efficient, failure resistant and that satisfies the needs of this new energy exchange paradigm. 

This future scenario might deeply impact the current Grid infrastructure, especially the Medium and Low 
Voltage section where the prosumer and consumer exchanges will happen. The Grid infrastructure, with the 
associated reliability, losses, quality, performance and related transmission costs, might act as an enabler or 
an inhibitor of the local energy exchange. The Medium and Low Voltage Grid is likely to face important changes 
in its infrastructure to support the Smart Grid [17], and even more so to enable a scenario where energy producers 
are many and the interactions are at local scale. Usually, the lower end of the Grid has been considered of small 
importance and less critical than the High Voltage infrastructure; however, the tendency is likely to be reversed 
in a prosumer-based energy paradigm. 

In our study we resort to Complex Network Analysis, a branch of Graph Theory rooted in the 
early studies of Erdős and Rényi [35] on random graphs and considering statistical structural properties of very 
large graphs. Although its roots lie in the past, Complex Network Analysis (CNA) is a relatively young 
field of research. The first systematic studies appeared in the late 1990s [531 E3 El H] with the goal of 
looking at the properties of large networks with a complex systems behavior. Afterwards, Complex Network 
Analysis has been used in many different fields of knowledge, from biology [5T] to chemistry [33], from linguistics 
to social sciences [57], from telephone call patterns [T] to computer networks [3J5] and web [3J [33J to virus 
spreading [SSJ [37] H3J to logistics [SHI EH I2S] and also inter-banking systems [T3]. Man-made infrastructures are 
especially interesting to study through the lens of Complex Network Analysis, especially when they are large scale 
and grow in a decentralized and independent fashion, thus not being the result of a global design, but rather of many 
local autonomous designs. The Power Grid is a prominent example. In this work we consider a novel approach 
both in considering Complex Network Analysis tools as a design instrument (i.e., CNA-related metrics are used 
in finding the most suited Medium and Low Voltage Grid for local energy exchange) and in focusing on the 
Medium and Low Voltage layers of the Power Grid. In fact, traditionally, Complex Network Analysis studies 
applied to the Power Grid only evaluate reliability issues and disruption behavior of the Grid when nodes or 
edges of the High Voltage layer are compromised. 

In summary, the requirements of the new Power Grid enabling decentralized trading are: 

1. Realizing the small-scale network paradigm; 

2. Improving local energy exchange; 

3. Supporting renewable-based energy production; 

4. Encouraging the end-user (technically, economically and politically) to buy/sell energy locally; 

5. Realizing networks easy to repeat at different scales (i.e., neighborhood, small village, city, metropolis); 

6. Reducing losses in the Medium and Low Voltage end of the Grid; and 

7. Enabling smartness in the automation of energy exchanges and their accounting. 

Among the above general requirements, several are tightly connected with the topology of the network, while others 
are more related to the control and ICT-oriented aspects of the Smart Grid. For the former aspects we provided 
a first investigation in our previous work [74] and continue it in the present work; the latter aspects are outside the scope of 
the present work and can be traced to other investigations such as []1|1 [TH1 HH HE]. 

3 Graph Theory Background 

The approach used in this work to model the Power Grid and its evolution is based on Graph Theory and 
Complex Networks. Here we recall the basic definitions that we use throughout the paper and refer to standard 
textbooks such as [TTJ [T^] for a broader introduction. First we define a graph for the Power Grid [73] . 

Definition 1 (Power Grid graph). A Power Grid graph is a graph $G(V,E)$ such that each element $v_i \in V$ is 
either a substation, transformer, or consuming unit of a physical Power Grid. There is an edge $e_{i,j} = (v_i, v_j) \in E$ 
between two nodes if there is a physical cable directly connecting the elements represented by $v_i$ and $v_j$. 

One can also associate weights to the edges representing physical cable properties (e.g., resistance, voltage, 
supported current flow). 

Definition 2 (Weighted Power Grid graph). A Weighted Power Grid graph is a Power Grid graph $G_W(V,E)$ 
with an additional function $f : E \to \mathbb{R}$ associating a real number to an edge representing the physical property of 
the corresponding cable (e.g., the resistance, expressed in Ohm, of the physical cable). 

A first classification of graphs is expressed in terms of their size. 

Definition 3 (Order and size of a graph). Given the graph G, the order is given by $N = |V|$, while the size is 
given by $M = |E|$. 

From order and size it is possible to obtain a global value for the connectivity of the vertexes of the graph, 
known as the average node degree, that is, $\langle k \rangle = \frac{2M}{N}$. To characterize the relationship between a node and the 
others it is connected to, the following properties provide an indication of the bond between them. 

Definition 4 (Adjacency, neighborhood and degree). If $e_{x,y} \in E$ is an edge in graph G, then x and y are 
adjacent, or neighboring, vertexes, and the vertexes x and y are incident with the edge $e_{x,y}$. The set of vertexes 
adjacent to a vertex $x \in V$, called the neighborhood of x, is denoted by $\Gamma_x$. The number $d(x) = |\Gamma_x|$ is the degree 
of x. 
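As an illustration of Definitions 1, 3 and 4, the following minimal Python sketch stores an undirected Power Grid graph as a dictionary of neighborhoods. The function names and the toy network (one substation `S1` and three transformers) are hypothetical and chosen only for the example; the average node degree is computed as 2M/N, which assumes an undirected graph without isolated vertexes.

```python
from collections import defaultdict


def build_graph(edges):
    """Return an undirected graph as {vertex: set of adjacent vertexes}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def order(adj):
    """Order N: the number of vertexes."""
    return len(adj)


def size(adj):
    """Size M: the number of edges (each edge is stored twice in adj)."""
    return sum(len(neighbors) for neighbors in adj.values()) // 2


def average_degree(adj):
    """Average node degree, assumed here to be 2M/N (undirected graph)."""
    return 2 * size(adj) / order(adj)


# Hypothetical toy network: a substation feeding three transformers,
# two of which are also directly connected by a cable.
edges = [("S1", "T1"), ("S1", "T2"), ("S1", "T3"), ("T1", "T2")]
grid = build_graph(edges)
```

Here `grid["S1"]` plays the role of the neighborhood of `S1`, and `len(grid["S1"])` is its degree.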

A measure of the average 'density' of the graph is given by the clustering coefficient, characterizing the extent 
to which vertexes adjacent to any vertex v are adjacent to each other. 

Definition 5 (Clustering coefficient (CC)). The clustering coefficient $\gamma_v$ of $\Gamma_v$ is 

$$\gamma_v = \frac{|E(\Gamma_v)|}{\binom{k_v}{2}}$$

where $|E(\Gamma_v)|$ is the number of edges in the neighborhood of v, $k_v$ is the degree of v, and $\binom{k_v}{2}$ is the total number of possible edges 
in $\Gamma_v$. 

This local property of a node can be extended to an entire graph by averaging over all nodes. 
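The clustering coefficient can be computed directly from the neighborhoods. The sketch below is one possible implementation; it assumes the graph is given as a dictionary of neighbor sets and adopts the convention (not stated in the text) that a vertex with fewer than two neighbors has coefficient 0.

```python
from itertools import combinations


def clustering_coefficient(adj, v):
    """Fraction of pairs of neighbors of v that are themselves adjacent."""
    neighbors = adj[v]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention assumed for this sketch
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)


def average_clustering(adj):
    """Graph-level value: mean of the local coefficients over all vertexes."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)


# Small example: a triangle a-b-c with a pendant vertex d attached to a.
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

In this example only one of the three possible pairs of neighbors of `a` is connected, so its coefficient is 1/3, while `b` and `c` have coefficient 1.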

Another important property is how much any two nodes are far apart from each other, in particular the 
minimal distance between them or shortest path. The concepts of path and path length are crucial to understand 
the way two vertexes are connected. 

Definition 6 (Path and path length). A path of G is a subgraph P of the form: 

$$V(P) = \{x_0, x_1, \ldots, x_l\}, \quad E(P) = \{(x_0, x_1), (x_1, x_2), \ldots, (x_{l-1}, x_l)\}$$

such that $V(P) \subseteq V$ and $E(P) \subseteq E$. The vertexes $x_0$ and $x_l$ are end-vertexes of P and $l = |E(P)|$ is the length 
of P. A graph is connected if for any two distinct vertexes $v_i, v_j \in V$ there is a finite path from $v_i$ to $v_j$. 

Definition 7 (Distance). Given a graph G and vertexes $v_i$ and $v_j$, their distance $d(v_i, v_j)$ is the minimal length 
of any $v_i$-$v_j$ path in the graph. If there is no $v_i$-$v_j$ path then it is conventionally set to $d(v_i, v_j) = \infty$. 

Definition 8 (Shortest path). Given a graph G and vertexes $v_i$ and $v_j$, the shortest path is the path 
corresponding to the minimum of the set $\{|P_1|, |P_2|, \ldots, |P_k|\}$ containing the lengths of all paths for which $v_i$ and 
$v_j$ are the end-vertexes. 

A global measure for a graph is given by its average distance among any two nodes. 

Definition 9 (Average path length (APL)). Let $v_i \in V$ be a vertex in graph G. The average path length for G, $L_{av}$, is 

$$L_{av} = \frac{1}{N(N-1)} \sum_{i \neq j} d(v_i, v_j)$$

where $d(v_i, v_j)$ is the finite distance between $v_i$ and $v_j$ and N is the order of G. 

Definition 10 (Characteristic path length (CPL)). Let $v_i \in V$ be a vertex in graph G. The characteristic path 
length for G, $L_{cp}$, is defined as the median of $d_{v_i}$, where: 

$$d_{v_i} = \frac{1}{N-1} \sum_{j \neq i} d(v_i, v_j)$$

is the mean of the distances connecting $v_i$ to any other vertex $v_j$ in G and N is the order of G. 
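Both path-length measures can be obtained from breadth-first searches on an unweighted graph. The following sketch assumes a connected graph stored as a dictionary of neighbor sets and raises an error otherwise; the function names are illustrative. The average path length is taken as the mean distance over all ordered pairs of distinct vertexes, and the characteristic path length as the median of the per-vertex mean distances.

```python
from collections import deque
from statistics import median


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def mean_distances(adj):
    """Per-vertex mean distance to all other vertexes (connected graph)."""
    n = len(adj)
    means = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        if len(dist) != n:
            raise ValueError("graph is not connected")
        means[v] = sum(dist.values()) / (n - 1)
    return means


def average_path_length(adj):
    """Mean distance over all ordered pairs of distinct vertexes."""
    means = mean_distances(adj)
    return sum(means.values()) / len(means)


def characteristic_path_length(adj):
    """Median of the per-vertex mean distances."""
    return median(mean_distances(adj).values())


# Example: a feeder-like path a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
```

For a weighted Power Grid graph the breadth-first search would have to be replaced by a weighted shortest-path algorithm such as Dijkstra's.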
To describe the importance of a node with respect to minimal paths in the graph, the concept of betweenness 
helps. Betweenness (sometimes also referred to as load) for a given vertex is the number of shortest paths between 
any other nodes that traverse it. 

Definition 11 (Betweenness). The betweenness b(v) of vertex $v \in V$ is 

$$b(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$

where $\sigma_{st}(v)$ is 1 if the shortest path between vertex s and vertex t goes through vertex v, 0 otherwise, and $\sigma_{st}$ is 
the number of shortest paths between vertex s and vertex t. 
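A possible way to compute betweenness for all vertexes of an unweighted graph is Brandes' accumulation scheme, sketched below. This is a swapped-in standard algorithm, not a procedure described in the text; the sketch counts every unordered pair of end-vertexes once (an assumed convention) and splits the contribution of a pair equally among its shortest paths.

```python
from collections import deque


def betweenness(adj):
    """Betweenness of every vertex, counting each unordered pair once."""
    b = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order_visited = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order_visited.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Back-propagate pair dependencies from the farthest vertexes.
        delta = {v: 0.0 for v in adj}
        while order_visited:
            w = order_visited.pop()
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                b[w] += delta[w]
    # Each unordered pair was seen from both of its end-vertexes.
    return {v: value / 2 for v, value in b.items()}


star = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
ring = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
```

In the star, all three leaf-to-leaf shortest paths cross the hub, whereas in the four-node ring each vertex carries half of the single opposite pair.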

Looking at large graphs, one is usually interested in global statistical measures rather than the properties of a 
specific node. A typical example is the node degree, where one measures the node degree probability distribution. 

Definition 12 (Node degree distribution). Consider the degree k of a node in a graph as a random variable. 
The function 

$$N_k = \{v \in G : d(v) = k\}$$

is called probability node degree distribution. 

The shape of the distribution is a salient characteristic of the network. For the Power Grid, the shape 
is typically either exponential or a power-law [5J [31 [7H [HO]. More precisely, an exponential node degree (k) 
distribution has a fast decay in the probability of having nodes with relatively high node degree. It follows the relation: 

$$P(k) = \alpha e^{\beta k}$$

where $\alpha$ and $\beta$ are parameters of the specific network considered. On the contrary, a power-law 
distribution has a slower decay, with a higher probability of having nodes with high node degree. It is expressed by the 
relation: 

$$P(k) = \alpha k^{-\gamma}$$

where $\alpha$ and $\gamma$ are parameters of the specific network considered. We remark that the graphs considered in the 
Power Grid domain are usually large, although finite, in terms of order and size, thus providing limited and finite 
probability distributions. 
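To make the two shapes concrete, the sketch below computes the empirical node degree distribution of a graph and estimates the parameters of an exponential or a power-law relation by ordinary least squares on logarithmic scales. The fitting procedure is an assumption made for illustration only (a crude estimate, since it weights all degrees equally), and the names `alpha`, `beta` and `gamma` simply label the parameters of the two relations.

```python
import math
from collections import Counter


def degree_distribution(adj):
    """Empirical P(k): fraction of vertexes having degree k."""
    n = len(adj)
    counts = Counter(len(neighbors) for neighbors in adj.values())
    return {k: c / n for k, c in sorted(counts.items())}


def fit_line(xs, ys):
    """Least-squares slope and intercept of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_power_law(pk):
    """Estimate (alpha, gamma) of P(k) = alpha * k**(-gamma) on log-log axes."""
    ks = [k for k, p in pk.items() if k > 0 and p > 0]
    slope, intercept = fit_line([math.log(k) for k in ks],
                                [math.log(pk[k]) for k in ks])
    return math.exp(intercept), -slope


def fit_exponential(pk):
    """Estimate (alpha, beta) of P(k) = alpha * exp(beta * k) on semi-log axes."""
    ks = [k for k, p in pk.items() if p > 0]
    slope, intercept = fit_line(ks, [math.log(pk[k]) for k in ks])
    return math.exp(intercept), slope
```

Comparing the residuals of the two fits on the same empirical distribution gives a first indication of which shape describes a network sample better.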

A Graph can also be represented as a matrix, typically an adjacency matrix. 

Definition 13 (Adjacency matrix). The adjacency matrix $A = A(G) = (a_{ij})$ of a graph G of order N is the 
$N \times N$ matrix given by 

$$a_{ij} = \begin{cases} 1 & \text{if } (v_i, v_j) \in E, \\ 0 & \text{otherwise.} \end{cases}$$

We have now provided the basic definitions needed to present the modeling tools for the Power Grid evolutions. 



4 Modeling the Power Grid 

To address the question of which topologies are best suited to characterize the Medium and Low Voltage 
Grids, we study models for graph generation proposed for technological complex networks. For each model we 
evaluate the properties of the network for several values of the order of the graph. Following our analysis of the 
Northern Dutch Medium and Low Voltage Grid [74], we categorize networks as Small, Medium and Large, see Table 1. 
We then analyze the properties of the networks coming from the generated models by applying relevant Complex 
Network Analysis metrics and combining them appropriately. In this way, Complex Network Analysis is not only 
a tool for analysis, but it becomes a design tool for the future electrical Grid. 

Most studies using Complex Network Analysis focus on extracting properties of networks arising from natural 
phenomena (e.g., food webs [35], protein interactions [51], neural networks of microorganisms [93]), and human 
generated networks (computer networks [55], the web [3], transport systems [47]) to try to understand which 
underlying rules characterize them. Here we look at network models that have proven successful in showing salient 
characteristics of technological networks (i.e., preferential attachment, Copying Model, power-law networks), social 
networks (i.e., small-world, Kronecker graph, recursive matrix) and natural phenomena as well (e.g., Random 
Graph, small-world, Forest Fire) to investigate which one is best suited for supporting local-scale energy exchange 
from a topological point of view. Next we provide a brief introduction to all the models used in the present study, 
while a more in-depth presentation is available for instance in [20] or [72]. 
Network layer     Category   Order
Low Voltage       Small      ≈20
Low Voltage       Medium     ≈90
Low Voltage       Large      ≈200
Medium Voltage    Small      ≈250
Medium Voltage    Medium     ≈500
Medium Voltage    Large      ≈1000

Table 1: Categories of Medium and Low Voltage networks and their order based on [73]. 

Random Graph 

A Random Graph is a graph built by picking nodes under some random probability distribution and connecting 
them by edges. It is due to the pioneering studies of Erdős and Rényi [3S1 [37). More precisely, there are two 
ways to build a Random Graph: (a) the $G_{N,p}$ model proposed by Erdős and Rényi considers a set of N nodes and 
for each pair of nodes an edge is added with a certain probability p; (b) the $G_{N,M}$ model considers with equal 
probability all the graphs having N vertexes and exactly M edges randomly selected among all the possible pairs 
of nodes. The models have the same asymptotic properties. We use the $G_{N,M}$ model since we are interested in 
setting both the number of nodes and edges for the networks to generate. A Random Graph with order 199 and 
size 400 is shown in Figure 1. 




Figure 1: A Random Graph. 
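The model with a fixed number of nodes and edges can be sketched in a few lines of Python. The function name `random_graph_nm` and the use of the standard `random` module are choices made for this illustration; the sketch enumerates all node pairs, which is adequate for networks of the orders considered here but not for very large graphs.

```python
import random
from itertools import combinations


def random_graph_nm(n, m, seed=None):
    """Pick m distinct edges uniformly among all pairs of n nodes."""
    max_edges = n * (n - 1) // 2
    if m > max_edges:
        raise ValueError("too many edges requested")
    rng = random.Random(seed)
    all_pairs = list(combinations(range(n), 2))
    adj = {v: set() for v in range(n)}
    for u, v in rng.sample(all_pairs, m):
        adj[u].add(v)
        adj[v].add(u)
    return adj


# Same order and size as the Random Graph example mentioned in the text.
g = random_graph_nm(199, 400, seed=1)
```

Nothing in this construction guarantees connectivity, so a generated sample may contain isolated nodes or several components.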



Small-world Graph 

The small-world phenomenon became famous after the works of Milgram in the sociological context [HTJ \E7\ 
who found short chains of acquaintances connecting random people in the USA. More recently, the small-world 
characterization of graphs has been investigated by Watts and Strogatz [93] [94], who showed the presence of the 
small-world property in many types of networks such as actor acquaintances, the Power Grid infrastructure and 
neural networks in worms. A small-world graph is obtained from a regular lattice that connects the nodes, followed by a process of 
rewiring the edges with a certain probability $p \in [0,1]$. The resulting graph has intermediate properties between 
the extreme situations of a regular lattice (p = 0) and a random graph (p = 1). In particular, small-world 
networks hold interesting properties: the characteristic path length is comparable to the one of a corresponding 
random graph ($L_{sw} \gtrsim L_{random}$), while the clustering coefficient has a value bigger than that of a random graph and 
closer to the one of a regular lattice ($CC_{sw} \gg CC_{random}$). A small-world graph with order 200 and size 399 is 
shown in Figure 2. 






Figure 2: A small-world graph. 
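The lattice-plus-rewiring construction can be sketched as follows. This is a minimal illustration of the rewiring idea rather than the exact generator used for the figures: the ring lattice with `k` nearest neighbors per node, the function name and the rule of keeping one end of a rewired edge fixed are assumptions of the sketch.

```python
import random


def small_world_graph(n, k, p, seed=None):
    """Ring lattice (k nearest neighbors, k even) rewired with probability p."""
    if k % 2 or k >= n:
        raise ValueError("k must be even and smaller than n")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Regular lattice: each node is linked to k/2 neighbors on each side.
    for v in range(n):
        for j in range(1, k // 2 + 1):
            w = (v + j) % n
            adj[v].add(w)
            adj[w].add(v)
    # Rewire each lattice edge (v, v + j) with probability p.
    for j in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + j) % n
            if w in adj[v] and rng.random() < p:
                candidates = [u for u in range(n)
                              if u != v and u not in adj[v]]
                if not candidates:
                    continue
                new = rng.choice(candidates)
                adj[v].discard(w)
                adj[w].discard(v)
                adj[v].add(new)
                adj[new].add(v)
    return adj


lattice = small_world_graph(20, 4, 0.0, seed=1)    # p = 0: regular lattice
rewired = small_world_graph(200, 4, 0.1, seed=1)   # a few long-range links
```

Rewiring moves edges without creating or deleting any, so the size of the graph stays at n·k/2 for every value of p.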



Preferential Attachment 

The preferential attachment model represents the phenomenon happening in real networks, where a fraction of 
nodes has a high connectivity while the majority of nodes has a small node degree. This model is built upon the 
observation by Barabási and Albert [5] of a typical pattern characterizing several types of natural and artificial 
networks. The basic idea is that whenever a node is added to the network and connects (through edges) to m 
other nodes, those with higher degree are preferred for connection. In other words, the probability to establish 
an edge with an existing node i is given by $\Pi(k_i) = \frac{k_i}{\sum_j k_j}$, where $k_i$ is the node degree of node i. One can see 
then that the more connected nodes have higher chances to acquire more and more edges over time in a sort of 
"rich get richer" fashion; a phenomenon studied by Pareto [77] in relation to land ownership. The preferential 
attachment model reaches a stationary solution for the node degree probability that follows a power-law with 
$P(k) = \frac{2m^2}{k^3}$. A graph based on preferential attachment with order 200 and size 397 is shown in Figure 3. 




Figure 3: A Preferential Attachment graph. 
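A compact way to realize degree-proportional selection is to keep a pool in which every node appears once per incident edge, so that a uniform draw from the pool picks a node with probability proportional to its degree. The sketch below follows this idea; the seed configuration (m initial nodes to which the first new node connects) and the function name are assumptions of the example.

```python
import random


def preferential_attachment_graph(n, m, seed=None):
    """Grow a graph where each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    if not 1 <= m < n:
        raise ValueError("require 1 <= m < n")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    targets = list(range(m))   # the first new node links to the m seed nodes
    pool = []                  # node v appears degree(v) times in the pool
    for new in range(m, n):
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
        pool.extend(targets)
        pool.extend([new] * m)
        # Draw m distinct targets for the next node, degree-proportionally.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))
        targets = list(chosen)
    return adj


g = preferential_attachment_graph(200, 2, seed=1)
```

With this seed configuration the resulting size is (n - m)·m, so the example has 396 edges.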
R-MAT 



R-MAT (Recursive MATrix) is a model that exploits the representation of a graph through its adjacency 
matrix [21]. In particular, it applies a recursive method to create the adjacency matrix of the graph, thus obtaining 
a self-similar graph structure. This model captures the community-based pattern appearing in some real 
networks. Moreover, the generated graph is characterized by a power-law node degree distribution while showing a 
small diameter. The idea is to start with an empty $N \times N$ matrix and then divide the square matrix into four 
partitions in which the nodes are present with a certain probability for each partition, specifically probabilities 
a, b, c, d that sum to one. The procedure is then repeated, dividing each partition again into four sub-partitions and 
associating the probabilities. The procedure stops when a $1 \times 1$ cell is reached. The 
a, b, c, d partitions of the adjacency matrix have a particular meaning: a and d represent the portions containing 
nodes belonging to different communities, while b and c represent the nodes that act as a link between the different 
communities (e.g., in a social network, people with interests in topics mostly popular in either the a or the d 
community). The recursive nature of this algorithm creates a sort of sub-communities at each round. A graph based 
on the R-MAT model with order 222 and size 499 is shown in Figure 4. 
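The recursive quadrant selection can be sketched as below: each edge is placed by descending through the adjacency matrix, choosing one of the four partitions at every level until a single cell is reached. The probability values, the treatment of the graph as undirected and the rejection of self-loops and duplicate edges are assumptions of this illustration, not parameters taken from the study.

```python
import random


def rmat_edge(scale, a, b, c, rng):
    """Descend `scale` levels of the matrix; return the selected (row, col)."""
    row = col = 0
    for _ in range(scale):
        r = rng.random()
        row *= 2
        col *= 2
        if r < a:
            pass                 # top-left partition (a)
        elif r < a + b:
            col += 1             # top-right partition (b)
        elif r < a + b + c:
            row += 1             # bottom-left partition (c)
        else:
            row += 1             # bottom-right partition (d)
            col += 1
    return row, col


def rmat_graph(scale, n_edges, a=0.45, b=0.15, c=0.15, d=0.25, seed=None):
    """Undirected R-MAT-style graph on 2**scale nodes (illustrative values)."""
    if abs(a + b + c + d - 1.0) > 1e-9:
        raise ValueError("a, b, c, d must sum to one")
    n = 2 ** scale
    if n_edges > n * (n - 1) // 2:
        raise ValueError("too many edges requested")
    rng = random.Random(seed)
    edges = set()
    attempts = 0
    while len(edges) < n_edges and attempts < 100 * n_edges:
        attempts += 1
        u, v = rmat_edge(scale, a, b, c, rng)
        if u != v:               # skip self-loops; the set removes duplicates
            edges.add((min(u, v), max(u, v)))
    return n, sorted(edges)


n_nodes, edge_list = rmat_graph(8, 499, seed=3)
```

Because the matrix side is a power of two, some nodes may receive no edge at all; counting only nodes with at least one edge gives an order below 2**scale.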



Models Independent from the Average Node Degree 

For certain models there is no explicit dependence on the average node degree. These include the Random 
Graph with power-law model, the Copying Model, Forest Fire and the Kronecker Graph, which are presented next. 

Random Graph with Power-law A Random Graph with power-law model generates networks characterized 
by a power-law in the node degree distribution ($P(k) \sim k^{-\gamma}$), having the majority of nodes with a low degree 
and a small number of nodes with a very high degree. Power-law distributions are very common in many real 
life networks, both created by natural processes (e.g., food-webs, protein interactions) and by artificial ones (e.g., 
airline travel routes, Internet routing, telephone call graphs) [BJ. The types of networks that follow this property 
are also referred to as Scale-free networks ([HIES]). From the dynamic point of view, these networks are modeled 
by a preferential attachment model. In addition, these graphs have a characteristic reliability profile, that is, a high degree of 
tolerance to random failures and a high sensitivity to targeted attacks towards high degree nodes or hubs [4, 681130]. 

This model is characterized by the exponent of the power-law (i.e., $\gamma$), which governs the degree of each node. 
The edges between the nodes are then wired in a random fashion. As we have shown earlier, the other way of 
constructing a graph that is compliant with a power-law based node degree distribution is through the growth 
of the network and preferential attachment based on node degree. A Random Graph with power-law with order 
200 and size 399 is shown in Figure 5. 
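One way to obtain a randomly wired graph with a prescribed power-law degree distribution is to draw a degree for every node and then pair the resulting edge "stubs" at random. The sketch below uses this stub-matching approach, which is an assumption of the example and not necessarily the generator used in the study; self-loops and repeated edges are simply discarded, so realized degrees can be slightly lower than the drawn ones.

```python
import random


def power_law_degrees(n, gamma, k_max, rng):
    """Draw n degrees from P(k) proportional to k**(-gamma), 1 <= k <= k_max."""
    ks = list(range(1, k_max + 1))
    weights = [k ** -gamma for k in ks]
    degrees = rng.choices(ks, weights=weights, k=n)
    if sum(degrees) % 2:
        degrees[0] += 1          # the number of stubs must be even
    return degrees


def random_power_law_graph(n, gamma, k_max=None, seed=None):
    """Random wiring of stubs whose counts follow a power-law."""
    rng = random.Random(seed)
    if k_max is None:
        k_max = n - 1
    degrees = power_law_degrees(n, gamma, k_max, rng)
    # One stub per unit of degree, shuffled and paired consecutively.
    stubs = [v for v, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    adj = {v: set() for v in range(n)}
    for u, v in zip(stubs[::2], stubs[1::2]):
        if u != v:               # drop self-loops; sets drop multi-edges
            adj[u].add(v)
            adj[v].add(u)
    return adj


g = random_power_law_graph(200, 2.5, seed=7)
```

The exponent 2.5 is an illustrative value; smaller exponents produce heavier tails and therefore more pronounced hubs.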

Figure 4: An R-MAT graph. 

Figure 5: A Random Graph with power-law. 

Copying Model Replicating the structure underlying the links of WWW pages led to the development of 
the Copying Model [SB], which captures the tendency of members of communities with the same interests to create pages 
with a similar structure of links. The basic intuition is to select a node and a number (k) of edges to add to 
the node. Then, with a certain probability $\beta$, the edges are linked independently and uniformly at random to k 
other nodes, while with probability $(1 - \beta)$ the k edges are copied from a randomly selected node u. If u has 
more than k edges, a subset is chosen, while if it has fewer than k edges they are copied anyway and the remaining ones 
are copied from another randomly chosen node. This leads to a distribution for the incoming degree that follows 
a power-law with a characteristic parameter $\gamma_{in} = \frac{1}{1-\beta}$. A graph based on the Copying Model with order 200 and 
size 199 is shown in Figure 6. 
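The copy-or-randomize step can be sketched as a growth process over directed out-links. In this illustration every new node receives exactly k out-links; the seed graph (k + 1 fully interlinked nodes), the function name and the parameter values are assumptions introduced for the example.

```python
import random


def copying_model_graph(n, k, beta, seed=None):
    """Directed graph {node: set of out-links}; each new node gets k links,
    random with probability beta, otherwise copied from existing nodes."""
    if k < 1 or n <= k + 1:
        raise ValueError("require k >= 1 and n > k + 1")
    rng = random.Random(seed)
    # Assumed seed graph: k + 1 nodes, each linking to all the others.
    out_links = {v: set(range(k + 1)) - {v} for v in range(k + 1)}
    for v in range(k + 1, n):
        if rng.random() < beta:
            # Link uniformly at random to k existing nodes.
            targets = set(rng.sample(range(v), k))
        else:
            # Copy links from randomly chosen prototypes until k are found.
            targets = set()
            while len(targets) < k:
                u = rng.randrange(v)
                links = sorted(out_links[u] - targets)
                rng.shuffle(links)
                targets.update(links[: k - len(targets)])
        out_links[v] = targets
    return out_links


g = copying_model_graph(200, 1, 0.5, seed=5)
```

Nodes that are frequently linked are also frequently copied, which is what produces the heavy tail in the in-degree distribution.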

Forest Fire In order to capture dynamic aspects of the evolution of networks, Leskovec et al. [61] proposed
the Forest Fire model. The intuition is that networks tend to densify in connectivity and shrink in diameter
(i.e., the greatest shortest path in the network) during the growth process; technological, social and information
networks show this phenomenon as they grow. The model requires two parameters known as the forward
burning probability (p) and the backward burning ratio (r). The graph grows over time and at each discrete time
step a node v is added; then a node w, known as the ambassador, is chosen at random among the other nodes of
the graph and a link between v and w is added. A random number x (drawn from a binomial distribution
with mean (1 − p)^(−1)) is chosen and this is the number of out-links of node w that are selected. Then in-links
of w are selected, in a number scaled by the ratio r with respect to the selected out-links, and v is linked to the
nodes at the other end of the selected links as well. The
process continues iteratively, choosing a new number x for each of the nodes v is now connected to. The idea, as
the name of the model suggests, resembles the spreading of a fire in a forest that starts from the ambassador
node, reaches a fraction (based on the probability parameters) of the nodes it is connected to and goes on in a sort
of chain reaction. This model leads to heavy tails in the distributions of both in-degree and out-degree. In
addition, a power-law is shown in the densification process: a newly arriving node tends to have most of its links
in the community of its ambassador and just a few with other nodes. A graph based on the Forest Fire model with
order 200 and size 505 is shown in Figure 7.
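The burning process can be illustrated with the following Python sketch. It is a simplified reading of the description above and not the authors' generator: the function names are hypothetical, the number of burned in-links is taken as r times x rounded to the nearest integer, and x is drawn from a geometric variable with mean (1 − p)^(−1) as a stand-in, since the text does not fix the number of trials of the binomial draw.

```python
import random


def draw_links(rng, p):
    """Geometric draw on {1, 2, ...} with mean 1 / (1 - p) (stand-in, see lead-in)."""
    x = 1
    while rng.random() < p:
        x += 1
    return x


def forest_fire(n, p, r, seed=None):
    """Grow a directed graph; returns a dict node -> set of out-neighbours."""
    rng = random.Random(seed)
    out_links = {0: set()}
    in_links = {0: set()}
    for v in range(1, n):
        out_links[v], in_links[v] = set(), set()
        ambassador = rng.randrange(v)      # chosen among the existing nodes
        visited = {v, ambassador}
        frontier = [ambassador]
        while frontier:
            w = frontier.pop()
            out_links[v].add(w)            # v links to every burned node
            in_links[w].add(v)
            x = draw_links(rng, p)         # out-links of w to follow
            y = round(r * x)               # in-links of w to follow (assumption)
            forward = [u for u in out_links[w] if u not in visited]
            backward = [u for u in in_links[w] if u not in visited]
            burned = (rng.sample(forward, min(x, len(forward)))
                      + rng.sample(backward, min(y, len(backward))))
            for u in burned:
                visited.add(u)
                frontier.append(u)
    return out_links


graph = forest_fire(n=200, p=0.2, r=0.2, seed=3)
edge_count = sum(len(targets) for targets in graph.values())
```

Because the fire keeps spreading from every newly burned node, the edge count grows faster than the order, which is the densification effect the model is meant to capture.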

Kronecker Graph A generating model with a recursive flavor similar to R-MAT uses the Kronecker product
applied to the adjacency matrix of a graph [60]. The Kronecker product is a non-conventional way of multiplying
two matrices.

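Definition 14 (Kronecker product). Given two matrices A and B with dimensions (n × m) and (n' × m'), the
Kronecker product between A and B is a matrix C with dimension (n · n' × m · m') with the following structure:

\[
C = A \otimes B =
\begin{pmatrix}
a_{1,1}B & a_{1,2}B & \cdots & a_{1,m}B \\
a_{2,1}B & a_{2,2}B & \cdots & a_{2,m}B \\
\vdots   & \vdots   & \ddots & \vdots   \\
a_{n,1}B & a_{n,2}B & \cdots & a_{n,m}B
\end{pmatrix}
\]
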
Figure 6: A Copying Model graph.




Definition 15 (Kronecker Graph). Given two graphs G and H with adjacency matrices A(G) and A(H), a 
Kronecker graph is a graph whose adjacency matrix is obtained by the Kronecker product between the adjacency 
matrices of G and H. 

If the Kronecker product is applied to the same matrix (that is, the matrix is multiplied with itself in the
Kronecker fashion), a self-similar structure arises in the graph. This can be seen as the growth
of a community in a network and its further differentiation into sub-communities while the network grows. This
model creates networks that show a densification in the connectivity of their nodes, which yields a shrinking
diameter over time. The procedure to create a graph based on the Kronecker product starts with an N × N matrix
where each element x_ij represents the probability of having an edge between nodes i and j. Thereafter, at each
time step the network grows so that at step k the network has N^k nodes. Applying the Kronecker product to the
same matrix recursively leads to the emergence of self-similar, fractal-like structures at different scales. This
structure mimics a quite natural process, namely the recursive growth of communities inside communities which are
a miniature copy of a big community (i.e., the whole graph structure) [60]. A Kronecker Graph with order 167 and
size 264 is shown in Figure 8.
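The recursive construction can be made concrete with a short Python sketch. The helper names and the 2 × 2 initiator values below are hypothetical and chosen only for illustration; treating the matrix as undirected by sampling only the upper triangle is likewise our own simplification.

```python
import random


def kronecker_product(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def kronecker_power(initiator, k):
    """Apply the Kronecker product of the initiator with itself k - 1 times."""
    result = initiator
    for _ in range(k - 1):
        result = kronecker_product(result, initiator)
    return result


def sample_graph(prob, seed=None):
    """Draw an undirected edge (i, j), i < j, with probability prob[i][j]."""
    rng = random.Random(seed)
    n = len(prob)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < prob[i][j]]


initiator = [[0.9, 0.5],
             [0.5, 0.2]]                  # hypothetical edge probabilities
prob = kronecker_power(initiator, 3)      # 2^3 = 8 nodes at step k = 3
edges = sample_graph(prob, seed=0)
```

Each step multiplies the number of nodes by the initiator dimension, so a 2 × 2 initiator yields 2^k candidate nodes at step k; nodes that receive no edge in the sampling can be discarded afterwards.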




Figure 8: A Kronecker graph. 



5 Network metrics 

In [74] we proposed a number of metrics useful for analyzing Power Grid topologies with decentralized
energy trading in mind. We recall them here together with new ones, which we then apply to the evolution/growth models
presented in Section 4. We set two main categories of requirements: qualitative and quantitative desiderata the
network should satisfy.

Qualitative requirements 

The main qualitative requirement we envision for the future Distribution Network is the modularity of
the network topology. In the power system domain, modularity is invoked as a solution that provides
benefits by reducing uncertainties in energy demand forecasting and costs for energy generation plants, as well as
risks of technological and regulatory obsolescence [S3 05]. Modularity is usually required not only in the energy
sector, but more generally in the design and creation of products or organizations [44]. It is also a principle
that is promoted in the innovation of complex systems [38] for the benefits it provides in terms of reduced design
and development time, adaptation and recombination. We assess the modularity of a network as the ability to
build the network using a self-similar recurrent approach, with a repetition of a kind of pattern in its
structure.
Quantitative requirements

As a global statistical tool, quantitative requirements are even more useful as they give a precise indication of
network properties. Here are the relevant ones when considering the efficiency, resilience and robustness of a power
system.

• Characteristic Path Length (CPL) lower than or equal to the natural logarithm of the order of the graph: CPL ≤ ln(N).
This requirement bounds the typical length of a path when moving from one node to another. In the
Grid this provides for a network with limited losses in the paths used to transfer energy from one node to
another.

• Clustering Coefficient (CC) which is 5 times higher than that of a corresponding random graph (rg) with the same
order and size: CC > 5 × CC_rg. Watts and Strogatz [94] show that small-world networks have a clustering
coefficient such that CC ≫ CC_rg. Here we require a similar condition, although a weaker one, by fixing a
constant factor of 5. This requirement is proposed in order to guarantee a local clustering among nodes, since
it is more likely that energy exchanges occur at a very local scale (e.g., neighborhood) when small-scale
distributed energy resources are widely deployed.

• Betweenness-related requirements:

- A low value for the average betweenness in terms of the order of the graph, υ = σ/N, where σ is the average
betweenness of the graph and N is the order of the graph. For the Internet, Vázquez et al. [89] have
found υ ≈ 2.5 for this metric. The Internet has proved successful in tolerating failures and attacks [231 14"];
therefore, we require a similar value of this metric for the future Grid.

- A coefficient of variation for betweenness c_v = s/x̄ < 1, where s is the sample standard deviation and x̄
is the sample mean of betweenness. Distributions with c_v < 1 are usually known as low-variance ones.

The above two requirements are generally considered to provide network resilience by limiting the number
of critical nodes that have a high number of minimal paths traversing them. These properties provide
distributions of shortest paths which are more uniform among all nodes.

• An index for robustness such that Rob_N > 0.45. Robustness is evaluated with a random removal strategy
and a node degree-based removal strategy, by computing the average of the order of the maximal connected
component (MCC) of the graph between the two situations when 20% of the nodes of the original graph
are removed [74]. It can be written as Rob_N = (|MCC_Random20%| + |MCC_NodeDegree20%|) / 2. Such a requirement is
about double the value observed for the current Medium Voltage samples and 33% more than for the Low Voltage samples [71].

• A measure of the cost related to the redundancy of paths available in the network: APL_10th < 2 × CPL.
With this metric we consider the cost of having redundant paths available between nodes. In particular,
we evaluate the 10th shortest path (i.e., the shortest path when the nine best ones are not considered) by
covering a random sample of the nodes in the network (40% of the nodes, half of which represent source nodes
and the other half destination nodes). The values for the paths considered are then averaged.
In the case where there are fewer than ten paths available, the worst-case path between the two nodes is
considered. This last condition gives values that are not fully significant when applied to networks with small
connectivity (i.e., absence of redundant paths).
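To make some of these requirements operational, the following Python sketch computes a path-length, a clustering and a robustness figure on a small undirected graph stored as a dictionary of neighbor sets. It is only a sketch under stated assumptions: the function names are ours, the mean shortest-path length is used as a stand-in for the CPL, and the order of the maximal connected component is normalized by the order of the original graph so that Rob_N falls between 0 and 1.

```python
import random
from collections import deque
from statistics import mean, stdev


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered pairs (stand-in for CPL)."""
    return mean(d for s in adj for t, d in bfs_distances(adj, s).items() if t != s)


def clustering_coefficient(adj):
    """Average over nodes of the fraction of neighbor pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def coefficient_of_variation(values):
    """Sample standard deviation divided by sample mean."""
    return stdev(values) / mean(values)


def largest_component_fraction(adj, removed, n_original):
    """Order of the largest component after removal, over the original order."""
    alive = set(adj) - set(removed)
    seen, best = set(), 0
    for s in alive:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in alive and w not in comp:
                    comp.add(w)
                    queue.append(w)
        seen |= comp
        best = max(best, len(comp))
    return best / n_original


def robustness(adj, fraction=0.2, seed=None):
    """Average of the two removal strategies (random and highest degree)."""
    rng = random.Random(seed)
    n = len(adj)
    count = round(fraction * n)
    random_removed = rng.sample(sorted(adj), count)
    degree_removed = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:count]
    return (largest_component_fraction(adj, random_removed, n)
            + largest_component_fraction(adj, degree_removed, n)) / 2


ring = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}  # closed chain of 10 nodes
print(characteristic_path_length(ring), clustering_coefficient(ring), robustness(ring, seed=1))
```

The closed chain used in the example has no triangles, so its clustering coefficient is null, which is the same behavior reported for tree-like and chain-like samples.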



Metric                      Efficiency  Resilience  Robustness
--------------------------  ----------  ----------  ----------
CPL                         ✓
CC                          ✓
Avg. Betweenness                        ✓
Betw. Coeff. of Variation               ✓
Rob_N                                               ✓
APL_10th                    ✓           ✓

Table 2: Metrics classification related to properties delivered to the network.

The above quantitative metrics can be grouped into three macro categories with respect to how they affect a
Power Grid: efficiency in the transfer of energy, resilience in providing alternative paths if part of the network is
compromised or congested, and robustness of network connectivity to failures. Table 2 summarizes the property
each metric assesses.






6 Generating Smart Grids 



Having presented topological models and relevant Power Grid metrics to evaluate them, we now generate
the networks in order to assess their quality. The baseline network for metric analysis must
be the real current Power Grid network. For this purpose, we use actual samples from the Medium and Low 
Voltage network of the Northern Netherlands (for a complete description of the data we refer to [711 175) ). 



Network sample  Model      Order  Size  Avg. deg.  CPL     CC       Removal robustness (Rob_N)  Redundancy cost (APL_10th)
--------------  ---------  -----  ----  ---------  ------  -------  --------------------------  --------------------------
LV-Small        Real data  21     22    2.095      4.250   0.00000  0.338                       12.364
LV-Medium       Real data  63     62    1.968      5.403   0.00000  0.245                       5.607
LV-Large        Real data  186    189   2.032      17.930  0.00000  0.134                       56.733
MV-Small        Real data  263    288   2.190      12.672  0.01117  0.184                       20.905
MV-Medium       Real data  464    499   2.151      13.107  0.00035  0.181                       18.399
MV-Large        Real data  884    1059  2.396      9.529   0.00494  0.298                       12.809

Table 3: Metrics for Dutch Medium Voltage and Low Voltage samples.



Table 3 summarizes the values for the network metrics applied to the Dutch network samples. We notice that the
average degree of the Medium and Low Voltage samples is almost constant around ⟨k⟩ ≈ 2, independently
of the order of the network. In the Low Voltage networks we see a tendency towards an increase of the characteristic
path length, with a value of about 18 when the order and size are about 200 nodes and edges, respectively. The
same metric does not show the same clear tendency for the Medium Voltage samples. Considering the clustering
coefficient, there is a general tendency: a null value for the Low Voltage samples and small, but at least non-zero,
values for the Medium Voltage samples. These differences in both characteristic path length and clustering
coefficient come from the difference in topology of the two networks. Low Voltage is almost a non-meshed
network which, for certain samples, resembles trees or closed chains with longer paths on average, especially when
the network grows. On the other hand, the Medium Voltage network is more meshed (despite the same average
node degree), with more connections that act as "shortcuts". It also has, to some extent, redundancy in the
connections among the neighbors of a node, which implies a more significant clustering coefficient compared
to the Low Voltage network. The analysis of the robustness metric shows generally poor scores that decrease
as the sample size increases, at least for the Low Voltage networks, while the tendency is not clear for the Medium
Voltage samples considered. A common behavior of the Medium Voltage samples is the problem they experience
in the connectivity of the biggest component when the 20% of the nodes with the highest degree are removed from the
network: the robustness falls to 0.0456, 0.0366 and 0.0396 for the Small, Medium and Large samples, respectively.
Considering the additional effort required when the first nine shortest paths are not available, we see a general
increase, especially for the Low Voltage samples, where the 10th average path length (redundancy cost column in
Table 3) is three times the characteristic path length for the Large sample analyzed; the increase is still present
in Medium Voltage, but it is limited compared to the Low Voltage samples. This is again an indication that the
Medium Voltage provides more efficient alternative paths to connect nodes. An exception in the results is the Low
Voltage Medium size sample: here the 10th average path length is really close to the characteristic path length. This
is basically due to the absence of alternative paths: the only paths between nodes are at the same time
the best and the worst case. This reinforces once again the idea of a Low Voltage network with a fixed structure
(sort of chain or tree like) and a limited redundancy.
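The comparison between the samples and the quantitative requirements of Section 5 can be checked mechanically. The short Python sketch below copies the order, CPL, Rob_N and APL_10th columns of Table 3 and tests three of the thresholds; the dictionary layout and the function name are ours and purely illustrative.

```python
import math

# (order N, CPL, Rob_N, APL_10th) copied from Table 3.
TABLE_3 = {
    "LV-Small":  (21, 4.250, 0.338, 12.364),
    "LV-Medium": (63, 5.403, 0.245, 5.607),
    "LV-Large":  (186, 17.930, 0.134, 56.733),
    "MV-Small":  (263, 12.672, 0.184, 20.905),
    "MV-Medium": (464, 13.107, 0.181, 18.399),
    "MV-Large":  (884, 9.529, 0.298, 12.809),
}


def check_sample(order, cpl, rob, apl_10th):
    """Evaluate three of the quantitative requirements for one sample."""
    return {
        "CPL <= ln(N)": cpl <= math.log(order),
        "Rob_N > 0.45": rob > 0.45,
        "APL_10th < 2 x CPL": apl_10th < 2 * cpl,
    }


RESULTS = {name: check_sample(*row) for name, row in TABLE_3.items()}

for name, checks in RESULTS.items():
    print(name, checks)
```

For instance, for LV-Large the bound ln(186) is about 5.2 while the measured CPL is 17.930, so the path-length requirement is missed by a wide margin.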



Network sample  Model      Order  Size  Avg. betweenness  Avg. betw/order  Coeff. variation
--------------  ---------  -----  ----  ----------------  ---------------  ----------------
LV-Small        Real data  21     22    70.286            3.347            0.643
LV-Medium       Real data  63     62    255.016           4.048            2.091
LV-Large        Real data  186    189   2928.227          15.743           1.207
MV-Small        Real data  263    288   1237.711          4.706            1.517
MV-Medium       Real data  464    499   3424.602          7.381            1.687
MV-Large        Real data  884    1059  7755.542          8.773            2.875

Table 4: Betweenness for Dutch Medium Voltage and Low Voltage samples.



Considering the betweenness-related metrics shown in Table 4, one notices an increase in the average betweenness
as the samples become larger in the two segments of the network (i.e., Medium Voltage and Low Voltage).
This same tendency is also present in the average betweenness to order ratio: the biggest samples in terms of
order, both of Low Voltage and Medium Voltage, score highest. In particular, the value for the Large Low Voltage
sample is almost twice that of the biggest Medium Voltage sample. This again can be explained
by the tree-like structure of the Low Voltage sample, in which the nodes on the paths that connect
sub-trees or sub-chains score highest for betweenness. This greatly increases the
average betweenness (while the mode is usually null). The coefficient of variation is above one for all the big
samples and reaches almost three for the biggest sample of the Medium Voltage network. Such a
high value implies a high standard deviation in the betweenness of the nodes, an indication of a heavy-tailed
distribution.

Model Parameters 

To model the future Power Grid we compare network topologies that quantitatively evolve in size and order. In
particular, we consider the increase of the average node degree (⟨k⟩ = 2M/N, where M is the size and N the order
of the graph). This evolution implies new cables and costs. For the Random Graph, small-world, preferential
attachment and R-MAT models, we consider an evolution of the average node degree from ≈ 2 to ≈ 4 and then ≈ 6.
The idea behind these values for the average node degree is to study how the properties and metrics of networks
change when increasing the connectivity of the network.
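As a worked illustration of how the target average node degree fixes the number of edges, consider the following relation, where M denotes the size and N the order of the graph; the order N = 200 is chosen here only as an example and the results are approximate targets, not values taken from the tables.

```latex
\langle k \rangle = \frac{2M}{N}
\quad\Longrightarrow\quad
M = \frac{\langle k \rangle \, N}{2},
\qquad
N = 200:\quad
\langle k \rangle \approx 2 \Rightarrow M \approx 200,\quad
\langle k \rangle \approx 4 \Rightarrow M \approx 400,\quad
\langle k \rangle \approx 6 \Rightarrow M \approx 600 .
```

Each step of the evolution therefore adds roughly N new edges, which is what translates into additional cabling.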

Each of the models introduced in Section 4 is defined by a set of parameters. Let us now consider some
meaningful values for each one of them.

• Random Graph. For the G_{n,m} model, the only parameters needed are the order and size of the graph
to be generated. We use the values shown in Table 1 for the order, and the size is chosen accordingly to
obtain an average node degree of two, four and six, respectively.

• Small-world Graph. In addition to the order, the small-world model requires the specification of the average
out-degree and the edge rewiring probability. For the first parameter, we simply provide a value to obtain
the desired average node degree (i.e., ⟨k⟩ ≈ 2, ⟨k⟩ ≈ 4 and ⟨k⟩ ≈ 6). The latter parameter represents
the probability of rewiring an edge connecting a source node to a different destination node chosen at
random. We choose an intermediate approach between the regular lattice (i.e., rewiring probability p = 0)
and random graph extremes (i.e., rewiring probability p = 1). In fact, we choose a rewiring probability
p = 0.4. This gives slightly more emphasis to the regular structure of the lattice than to the rewiring, since
we expect the future Grid to rely more on a regular structure than on random cabling. This last
aspect also helps to satisfy the qualitative requirement of modularity.

• Preferential Attachment. For the creation of a graph based on the growth and preferential attachment model
of Barabási-Albert [8], the only parameters needed are the order and size of the graph to be generated. We
use the values shown in Table 1 for the order parameter, while the size parameter is chosen accordingly to obtain
an average node degree of two, four and six, respectively.

• R-MAT. The R-MAT model requires several parameters. First of all, the order and size of the network, then
the a, b, c, d parameters which represent the probabilities of the presence of an edge in a certain partition of
the adjacency matrix. The order of the graph is chosen so that the number of nodes is a power of two, in particular
2^n where usually n = ⌈log₂ N⌉. Therefore, we consider for this model the following values for the order:
{32, 128, 256} for comparison with the Low Voltage, and {256, 512, 1024} for comparison with the Medium
Voltage Grids. For the probability parameters, since we have an undirected graph, we have b = c; in
addition, the ratio found between a and b, as in many real scenarios according to [21], is about 3:1. We
assume a more highly connected community (a = 0.46), a less connected community (d = 0.22) and a
relatively smaller connectivity between the two communities (b = c = 0.16).

• Copying Model. The copying model requires, in addition to the order of the graph, a value for the
probability of copying (or not) edges from existing nodes. (1 − β) is the probability of copying edges from
another node. In the present study, we fix β = 0.2 so as to have a high probability of having a direct
connection (just one hop, since with probability 0.8 each new node copies the connections of another node and
attaches directly to them) to what might be considered the most reliable energy sources present in the
city or villages (at Medium Voltage level), while at Low Voltage level these represent single users or small
aggregations of users with high energy capacity.

• Forest Fire. The Forest Fire model requires, in addition to the order of the graph, two values representing
the probability of forward and backward spread of the "burning fire". We choose the same value for both
probabilities since our graph is not directed. To avoid a flooding of edges, we choose a few small values to
assign to the forward and backward probability (p_fwd = p_bwd = 0.2; p_fwd = p_bwd = 0.3; p_fwd = p_bwd = 0.35)
that give realistic numbers of edges incident to a node on average and that can be compared with the
models for which one is able to directly set order and size.
• Random Graph with Power-law. For the model representing a Random Graph with a power-law in the node
degree distribution, the parameters required are essentially the order of the network and the characteristic
parameter of the power-law (known as the γ coefficient). For the first parameter, we use the usual dimensions
(see Table 1), while for the latter some additional considerations are necessary. We test different types
of power-law coefficients characterizing real technological networks. For the non-electrical technological
networks (i.e., technological networks which are not Power Grids) we average the values of the power-law
characteristic parameter described in [23]; the details of the parameters are shown in Table 5. For the
Power Grid networks the γ values represent:

- the findings for the Western and Eastern High Voltage U.S. Power Grid in [22]; the values are averaged
to have a single γ, and the details are shown in Table 6;

- the findings for the High Voltage U.S. Western Power Grid in [5], which reports a value γ = 4;

- the findings for the Medium and Low Voltage Dutch Grid samples that follow a power-law in [73]; the values
are averaged to have a single γ, and the details are shown in Table 7.



Type of network           γ
------------------------  -----
Internet degree           2.12
Telephone calls received  2.09
Blackouts                 2.3
Email address book size   3.5
Hits to web-sites         1.81
Links to web sites        2.336
Average                   2.359

Table 5: Power-law γ parameters for technological networks [2"4].



Type of network          γ
-----------------------  -----
Eastern Interconnection  3.04
Western System           3.09
Average                  3.065

Table 6: Power-law γ parameters for High Voltage U.S. Power Grid [22].

• Kronecker Graph. For the Kronecker model, the required parameter is the initial dimension of the square
matrix to which the Kronecker product is applied: a 2 × 2 initiator matrix is a good starting model [60]. Once the
structure of the matrix is defined, the initial parameters for the generation matrix need to be evaluated.
With a 2 × 2 adjacency matrix for the initial graph G:

\[
A(G) = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]

the parameters can be interpreted in a similar fashion as in R-MAT:

- a models the "core" part of the network and the tightness of its connectivity.

- d models the "periphery" part of the network and the connectivity inside it.

- b, c model the relationships and interconnections between the core and the periphery.

The findings of Leskovec et al. [60], who applied the Kronecker modeling to many different networks, report a
common recurrent structure for the parameters of the 2 × 2 Kronecker initiator matrix. In particular, the



Type of network  γ
---------------  -----
LV#5             2.402
LV#10            1.494
MV#2             1.977
MV#3             2.282
Average          2.039

Table 7: Power-law γ parameters for Dutch Medium and Low Voltage Power Grid [73].
                      2 × 2 Kronecker generator parameters
Type of network       a       c       b       d
--------------------  ------  ------  ------  ------
Social-technological  0.9578  0.4617  0.4623  0.3162
Power Grid            0.4547  0.8276  0.8504  0.0186

Table 8: Probability parameters for the 2 × 2 Kronecker matrix.



parameters tend to follow the empirical rule a ≫ b > c ≫ d and are usually a ≈ 1, b ≈ c ≈ 0.6 and d ≈ 0.2.
In this work, we consider two sets of parameters characterizing the Kronecker initiator matrix. The first
set is averaged from the parameters of technological and social networks extracted from real
sample data [50]. The second set of parameters is obtained by applying the Kronecker graph fitting procedure
to the UCTE High Voltage Power Grid data set used in [S0J IE3], the High Voltage U.S. Western
Power Grid data set used in [§3], and the Medium and Low Voltage samples data set used in [73]. All these
values have been averaged to obtain just one 2 × 2 Kronecker generation matrix. A summary of the values
for the parameters of the Kronecker matrix used is given in Table 8. One notices a very different structure
in the matrix parameters between the social and other diverse technological networks and the Power Grid.
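To illustrate how the R-MAT probabilities listed above (a = 0.46, b = c = 0.16, d = 0.22) are used, the following Python sketch places each edge by recursively choosing one quadrant of the adjacency matrix. It is a minimal sketch rather than the generator used for the experiments: the function names, the rejection of self-loops and duplicate edges, and the undirected edge representation are assumptions made here.

```python
import random


def rmat_edge(scale, a, b, c, rng):
    """Pick one cell of a 2^scale x 2^scale matrix by recursive quadrant choice."""
    row = col = 0
    for _ in range(scale):
        row, col = 2 * row, 2 * col
        x = rng.random()
        if x < a:
            pass                   # top-left quadrant
        elif x < a + b:
            col += 1               # top-right quadrant
        elif x < a + b + c:
            row += 1               # bottom-left quadrant
        else:
            row += 1               # bottom-right quadrant (probability d)
            col += 1
    return row, col


def rmat_graph(scale, n_edges, a=0.46, b=0.16, c=0.16, d=0.22, seed=None):
    """Return a set of n_edges distinct undirected edges (u, v) with u < v."""
    assert abs(a + b + c + d - 1.0) < 1e-9, "quadrant probabilities must sum to 1"
    rng = random.Random(seed)
    edges = set()
    while len(edges) < n_edges:
        u, v = rmat_edge(scale, a, b, c, rng)
        if u != v:                 # assumption: self-loops are discarded
            edges.add((min(u, v), max(u, v)))
    return edges


edges = rmat_graph(scale=5, n_edges=31, seed=7)   # up to 2^5 = 32 nodes
nodes_with_edges = {n for edge in edges for n in edge}
```

Since edges concentrate in the quadrant with the highest probability, some of the 2^n candidate nodes may receive no edge at all, so the number of connected nodes can be lower than the nominal power of two.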



Network type  Model                    Order  Size  Avg. deg.  CPL     CC       Removal robustness (Rob_N)  Redundancy cost (APL_10th)
------------  -----------------------  -----  ----  ---------  ------  -------  --------------------------  --------------------------
LV-Small      Small-world              20     20    2.000      4.053   0.00000  0.330                       7.580
LV-Medium     Small-world              90     90    2.000      11.820  0.01593  0.167                       12.932
LV-Large      Small-world              200    201   2.010      17.397  0.01083  0.109                       21.544
MV-Small      Small-world              250    250   2.000      24.237  0.00000  0.087                       24.534
MV-Medium     Small-world              500    501   2.004      28.084  0.00000  0.057                       35.413
MV-Large      Small-world              1000   1001  2.002      47.077  0.00000  0.040                       60.074
LV-Small      Preferential attachment  20     19    1.900      2.579   0.00000  0.349                       2.800
LV-Medium     Preferential attachment  90     89    1.978      4.315   0.00000  0.263                       4.471
LV-Large      Preferential attachment  200    199   1.990      6.523   0.00000  0.206                       6.375
MV-Small      Preferential attachment  250    249   1.992      5.426   0.00000  0.245                       5.570
MV-Medium     Preferential attachment  500    499   1.996      5.705   0.00000  0.231                       5.745
MV-Large      Preferential attachment  1000   999   1.998      6.976   0.00000  0.187                       6.908
LV-Small      Random Graph             17     21    2.471      2.938   0.07451  0.390                       7.472
LV-Medium     Random Graph             78     92    2.359      5.987   0.03547  0.418                       10.974
LV-Large      Random Graph             172    207   2.407      6.254   0.00736  0.354                       10.796
MV-Small      Random Graph             224    259   2.313      7.269   0.00000  0.322                       12.002
MV-Medium     Random Graph             435    516   2.372      8.380   0.00138  0.321                       12.818
MV-Large      Random Graph             863    1026  2.378      9.061   0.00070  0.328                       13.446
LV-Small      R-MAT                    27     31    2.296      3.615   0.00000  0.356                       7.830
LV-Medium     R-MAT                    88     125   2.841      4.115   0.05688  0.369                       6.418
LV-Large      R-MAT                    199    261   2.623      5.495   0.00737  0.364                       8.774
MV-Small      R-MAT                    195    263   2.697      5.629   0.00865  0.378                       8.642
MV-Medium     R-MAT                    365    523   2.866      5.470   0.01360  0.396                       7.646
MV-Large      R-MAT                    728    1056  2.901      5.726   0.00589  0.363                       7.887

Table 9: Metrics for small-world, preferential attachment, Random Graph and R-MAT models with average node
degree ≈ 2.



Model Generation 

Given the parameters presented above, we generate the graphs with respect to the different models and analyze
them according to the significant Power Grid metrics described in Section 5. We begin with the models for which
Network type  Model                    Order  Size  Avg. betweenness  Avg. betw/order  Coeff. variation
------------  -----------------------  -----  ----  ----------------  ---------------  ----------------
LV-Small      Small-world              20     20    62.300            3.115            0.804
LV-Medium     Small-world              90     90    [illegible]       [illegible]      [illegible]
LV-Large      Small-world              200    201   3429.720          17.149           1.260
MV-Small      Small-world              250    250   [illegible]       [illegible]      [illegible]
MV-Medium     Small-world              500    501   13980.228         27.960           1.745
MV-Large      Small-world              1000   1001  [illegible]       [illegible]      2.279
LV-Small      Preferential attachment  20     19    31.400            1.570            2.344
LV-Medium     Preferential attachment  90     89    293.400           3.260            3.068
LV-Large      Preferential attachment  200    199   [illegible]       5.446            3.288
MV-Small      Preferential attachment  250    249   [illegible]       [illegible]      [illegible]
MV-Medium     Preferential attachment  500    499   [illegible]       [illegible]      [illegible]
MV-Large      Preferential attachment  1000   999   6061.288          6.061            6.240
LV-Small      Random Graph             17     21    31.059            1.827            1.157
LV-Medium     Random Graph             78     92    408.308           5.235            1.126
LV-Large      Random Graph             172    207   938.512           5.456            1.276
MV-Small      Random Graph             224    259   1474.143          6.581            1.265
MV-Medium     Random Graph             435    516   3415.890          7.853            1.204
MV-Large      Random Graph             863    1026  7081.119          8.205            1.264
LV-Small      R-MAT                    27     31    70.593            2.615            1.320
LV-Medium     R-MAT                    88     125   282.500           3.210            1.540
LV-Large      R-MAT                    199    261   937.578           4.711            1.297
MV-Small      R-MAT                    195    263   959.118           4.919            1.395
MV-Medium     R-MAT                    365    523   1692.910          4.638            1.581
MV-Large      R-MAT                    728    1056  3633.473          4.991            2.004

Table 10: Betweenness metrics for small-world, preferential attachment, Random Graph and R-MAT models
with average node degree ≈ 2.
Network type  Model                    Order  Size  Avg. deg.  CPL     CC       Removal robustness (Rob_N)  Redundancy cost (APL_10th)
------------  -----------------------  -----  ----  ---------  ------  -------  --------------------------  --------------------------
LV-Small      Small-world              20     39    3.900      2.289   0.26000  0.721                       4.720
LV-Medium     Small-world              90     177   3.933      3.652   0.14646  0.780                       6.032
LV-Large      Small-world              200    399   3.990      4.407   0.15367  0.767                       6.631
MV-Small      Small-world              250    498   3.984      4.566   0.12581  0.779                       6.836
MV-Medium     Small-world              500    1000  4.000      5.067   0.10681  0.764                       7.231
MV-Large      Small-world              1000   1998  3.996      5.749   0.10879  0.781                       7.910
LV-Small      Preferential attachment  20     37    3.700      2.263   0.47341  0.554                       4.380
LV-Medium     Preferential attachment  90     177   3.933      2.910   0.11216  0.426                       4.788
LV-Large      Preferential attachment  200    397   3.970      3.322   0.09566  0.448                       5.047
MV-Small      Preferential attachment  250    497   3.976      3.504   0.08400  0.419                       4.998
MV-Medium     Preferential attachment  500    997   3.988      3.687   0.03929  0.401                       5.232
MV-Large      Preferential attachment  1000   1997  3.994      4.211   0.01536  0.401                       5.678
LV-Small      Random Graph             20     40    4.000      2.079   0.17667  0.733                       4.350
LV-Medium     Random Graph             87     180   4.138      3.174   0.03418  0.735                       5.368
LV-Large      Random Graph             199    400   4.020      3.869   0.03064  0.734                       6.107
MV-Small      Random Graph             247    500   4.049      4.057   0.01681  0.740                       6.432
MV-Medium     Random Graph             494    1000  4.049      4.495   0.00823  0.749                       6.670
MV-Large      Random Graph             987    2001  4.055      5.062   0.00359  0.738                       7.150
LV-Small      R-MAT                    30     59    3.933      2.517   0.27360  0.579                       4.511
LV-Medium     R-MAT                    105    250   4.762      3.019   0.13039  0.581                       4.490
LV-Large      R-MAT                    227    504   4.441      3.619   0.04683  0.601                       5.302
MV-Small      R-MAT                    230    496   4.313      3.736   0.02940  0.626                       5.381
MV-Medium     R-MAT                    420    1004  4.781      3.915   0.00450  0.591                       5.249
MV-Large      R-MAT                    932    2039  4.376      4.562   0.00875  0.690                       6.251

Table 11: Metrics for small-world, preferential attachment, Random Graph and R-MAT models with average
node degree ≈ 4.
0.404


Network type  Model                    Order  Size  Avg. betweenness  Avg. betw/order  Coeff. variation
------------  -----------------------  -----  ----  ----------------  ---------------  ----------------
MV-Large      Preferential attachment  1000   1997  3179.750          3.180            3.450
LV-Small      Random Graph             20     40    23.600            1.180            0.807
LV-Medium     Random Graph             87     180   196.345           2.257            0.850
LV-Large      Random Graph             199    400   589.849           2.964            0.889
MV-Small      Random Graph             247    500   766.389           3.103            0.857
MV-Medium     Random Graph             494    1000  1768.757          3.580            0.972
MV-Large      Random Graph             987    2001  4068.393          4.122            0.942
LV-Small      R-MAT                    30     59    44.000            1.467            1.342
LV-Medium     R-MAT                    105    250   223.733           2.131            1.695
LV-Large      R-MAT                    227    504   609.419           2.685            1.493
MV-Small      R-MAT                    230    496   650.374           2.828            1.468
MV-Medium     R-MAT                    420    1004  1285.786          3.061            1.652
MV-Large      R-MAT                    932    2039  3422.348          3.672            1.506

Table 12: Betweenness metrics for small-world, preferential attachment, Random Graph and R-MAT models
with average node degree ≈ 4.
Network type  Model                    Order  Size  Avg. deg.  CPL     CC       Removal robustness (Rob_N)  Redundancy cost (APL_10th)
------------  -----------------------  -----  ----  ---------  ------  -------  --------------------------  --------------------------
LV-Small      Small-world              20     59    5.900      1.816   0.33250  0.775                       3.470
LV-Medium     Small-world              90     266   5.911      2.809   0.20131  0.794                       4.508
LV-Large      Small-world              200    598   5.980      3.324   0.13596  0.797                       4.895
MV-Small      Small-world              250    747   5.976      3.486   0.14477  0.798                       5.039
MV-Medium     Small-world              500    1494  5.976      3.968   0.14477  0.799                       5.518
MV-Large      Small-world              1000   2996  5.992      4.429   0.14854  0.797                       5.905
54  5.400  1.868
0.742  3.933
5.940
0.08772
4.130
250
5.952  2.926  0.08676  0.705  4.257
5.980
0.05017  0.667  4.481
MV-Large
0.03335
4.664
20  60  6.000
0.29599  0.775  3.370


LV-Medium     Random Graph  90     270   6.000      2.640  0.06987  0.791                       4.298
LV-Large      Random Graph  200    600   6.000      3.141  0.03991  0.777                       4.693
MV-Small      Random Graph  249    750   6.024      3.230  0.01934  0.793                       4.884
MV-Medium     Random Graph  499    1500  6.012      3.620  0.00976  0.792                       5.284
MV-Large      Random Graph  998    3000  6.012      4.022  0.00544  0.791                       5.662
LV-Small      R-MAT         32     87    5.438      2.194  0.21179  0.760                       3.945
LV-Medium     R-MAT         123    374   6.081      2.926  0.08173  0.717                       4.377
LV-Large      R-MAT         249    759   6.096      3.165  0.04444  0.736                       4.622
MV-Small      R-MAT         236    747   6.331      3.143  0.04982  0.746                       4.389
MV-Medium     R-MAT         466    1512  6.489      3.427  0.04365  0.743                       4.805
MV-Large      R-MAT         925    3035  6.562      3.742  0.02560  0.723                       4.925



Table 13: Metrics for small-world, preferential attachment, Random Graph and R-MAT models with average
node degree ≈ 6.
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2039 


2599.496 


2.810 


1.731 



Table 14: Betweenness-related metrics for small-world, preferential attachment, Random Graph and R-MAT
models with average node degree ≈ 6.

it is possible to explicitly assign order and size (or one of these quantities and the average node degree); we then
proceed to analyze the other models, which do not allow the average node degree parameter to be set explicitly.



Model generation implementation and metrics computation 

The values and graphs of the generated topologies are obtained using software applications for network generation
and analysis. In particular, for the model generation we developed C++ programs using the Stanford Network
Analysis Project (SNAP) library (http://snap.stanford.edu/), which enables the generation of the network
topologies described in Section 4 and the assignment of the parameters described earlier in this section. The
analysis of the generated graphs according to the metrics described in Section 5 is performed with ad hoc
software based on the Java graph library JGraphT (http://www.jgrapht.org/). The only metric computed
with the SNAP software is betweenness, whose computation is based on the algorithm developed by Brandes [13].
To perform the generation and the computation of the metrics we used a PC with an Intel Core2 Quad CPU Q9400
at 2.66 GHz and 4 GB of RAM. The operating system is based on the Linux kernel 2.6.32, with the GCC 4.4.3
compiler and the Java framework 1.6. The versions of the SNAP and JGraphT libraries used are v10.10.01
and v0.8.1, respectively.
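
To make the betweenness-related metrics concrete, the following is a minimal Python sketch of Brandes' algorithm for unweighted, undirected graphs, together with the two derived quantities discussed in this section (average betweenness divided by the order, and the coefficient of variation). It is an illustration only, not the SNAP implementation used for the results; the function names are hypothetical, and the choice to count each unordered node pair once and to use the population standard deviation are assumptions, since the normalization is not restated here.

```python
from collections import deque
from statistics import mean, pstdev


def brandes_betweenness(adj):
    """Unnormalized node betweenness for an undirected, unweighted graph.

    adj maps each node to an iterable of its neighbors. Each unordered
    pair (s, t) is counted once (an assumption, see the lead-in).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies in order of decreasing distance.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair was visited from both of its endpoints.
    return {v: b / 2.0 for v, b in bc.items()}


def betweenness_summary(adj):
    """Return (average betweenness / order, coefficient of variation)."""
    values = list(brandes_betweenness(adj).values())
    avg = mean(values)
    cv = pstdev(values) / avg if avg > 0 else 0.0
    return avg / len(values), cv


# Path graph 0-1-2-3-4: the middle node lies on the most shortest paths.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
path_bc = brandes_betweenness(path)
```

On the five-node path the end nodes have zero betweenness and the central node the highest, which is the kind of spread that the coefficient of variation summarizes.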



Comparison of models with average node degree ⟨k⟩ ≈ 2

With average degree ⟨k⟩ ≈ 2, the small-world, preferential attachment, Random Graph and R-MAT models all
score quite poorly on the metrics, cf. Table 9. These low values are due to the limited connectivity of the
networks. The small-world model fares particularly badly under these conditions: with such a small average
degree, the characteristic path length becomes very high as the network grows, exceeding 45 for the biggest
network generated and reaching about 60 for the 10th shortest path measure. With so few edges the clustering
coefficient is also affected: the neighbors of a node are not organized in tight clusters because the number of
links available is limited, and only the Random Graph and R-MAT models have non-zero values, and only for a
few samples. The robustness to failure is limited under these conditions, with the worst case corresponding to
the small-world samples, while better results are shown by the preferential attachment, Random Graph and
R-MAT models. The last two score higher than 0.33 for this metric. Considering the cost of redundancy, we
generally see an increase in the characteristic path length as the order of the graph grows; the best results are
shown by the networks generated with the preferential attachment model, which presents values close to the
best ones for the average path length. This might be due to the absence of many redundant paths in such a
loosely connected network (fewer than 10 shortest paths without cycles between any two nodes). A graphical
comparison of the results for the Large sample of the Medium Voltage type, considering the characteristic path
length, clustering coefficient and robustness metrics, is given in Figure 9.



The betweenness analysis, whose results are presented in Table 10, shows an average for each node that
increases with the size of the graph. The small-world model differs from the other models in the value of the
average betweenness: for the largest networks (500 and 1000 nodes) the value is almost one order of magnitude
higher. This is due to the lattice structure of the small-world model, which with ⟨k⟩ ≈ 2 degenerates into a long
"closed-chain" topology that involves many nodes, while the number of edges that provide a "shortcut" in the
graph is limited. This is in line with the high characteristic path length just described. The R-MAT model
scores well with respect to the desiderata we imposed for the average betweenness to order ratio and the
coefficient of variation; the former is below 5 even for the biggest sample and the latter stays below 2. For the
small-world samples, we find a small coefficient of variation, which reinforces the result by indicating that
almost all nodes have the same high betweenness, close to the average. A graphical comparison of the results
for the Large sample of the Medium Voltage type, considering the average betweenness to order ratio and the
coefficient of variation, is shown in Figure 10.



Comparison of models with average node degree ⟨k⟩ ≈ 4

Table 11 shows the results for the small-world, preferential attachment, Random Graph and R-MAT models with
an average degree ⟨k⟩ ≈ 4. The scores for the metrics improve compared to the ⟨k⟩ ≈ 2 case. The characteristic
path length, averaged over all the samples, drops from around 10 to slightly less than 5. The clustering
coefficient values are significant and all positive. The small-world model scores best in this metric since it
relies on a lattice topology that, with an average degree of 4, connects each node with four neighbors. In
particular, three triangle structures emerge in the neighborhood of each node, which contributes substantially
to the quite high clustering coefficient. Generally, all models score higher than the random graph with respect
to the clustering coefficient (one of our desired properties). The additional links also enhance the robustness of
the network. Averaging the results over all the models, the order of the biggest connected component is about
63% of the initial order of the network, while for the ⟨k⟩ ≈ 2 networks the value is just 27%. Not surprisingly,
the best scores for robustness are obtained by the Random Graph model, since in this type of graph the nodes
tend to have the same characteristics and hubs are not present in the network. The small-world graphs show
robustness results quantitatively quite similar to those of the Random Graph (for some samples the metric
scores even higher), with a robustness that is close to 0.8. The preferential attachment and R-MAT models
score lower than the Random Graph and small-world models, with values around 0.45 for the former and 0.6 for
the latter (in both cases almost double the values for the ⟨k⟩ ≈ 2 case). The explanation for this lower score
resides in the way the preferential attachment and R-MAT models are built: they admit the presence of hubs
(the node degree distribution is characterized by a power law), and network robustness is highly sensitive to
hubs being targeted for removal. Considering the cost of the redundancy related to alternative paths, the
lowest values appear for the preferential attachment model, followed by R-MAT, with slightly higher values for
the Random Graph and small-world models. The worst case for the small-world model is a little smaller than 8,
which is anyway only 2 more than the characteristic path length for the same sample. A graphical comparison
of the results for the Large sample of the Medium Voltage type, considering the characteristic path length,
clustering coefficient and robustness metrics, is shown in Figure 11.

Figure 9: Metrics for the Large sample of Medium Voltage network type with average node degree ≈ 2.

Figure 10: Metrics for the Large sample of Medium Voltage network type with average node degree ≈ 2.

Figure 11: Results for metrics for the Large sample of Medium Voltage network type with average node degree
≈ 4.
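
The lattice argument for the small-world clustering coefficient can be checked with a short Python sketch. It builds a regular ring lattice (the starting point of the small-world model, before any edge is rewired) and counts the links among the neighbors of a node. The helper names are hypothetical, and the values it yields hold for the unrewired lattice only; the generated samples, in which a fraction of edges is rewired, have lower clustering coefficients.

```python
from itertools import combinations


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbors (k even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def neighbor_links(adj, v):
    """Number of edges among the neighbors of v (triangles through v)."""
    return sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])


def local_clustering(adj, v):
    """Fraction of neighbor pairs of v that are themselves connected."""
    k = len(adj[v])
    if k < 2:
        return 0.0
    return neighbor_links(adj, v) / (k * (k - 1) / 2)


lattice4 = ring_lattice(20, 4)
lattice6 = ring_lattice(20, 6)
```

With degree 4 each node has three links among its four neighbors (three out of six possible pairs); the same function can be called with other even degrees, for instance degree 6 gives nine links among the six neighbors.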

Analyzing the betweenness, we see a general improvement in the metrics compared to the ⟨k⟩ ≈ 2 case,
cf. Table 12. The most important improvement is for the small-world model which, with approximately 4
connections per node, reduces the average betweenness by a factor of 10 compared to the ⟨k⟩ ≈ 2 case.
Although the small-world model performs worse than the other models for the average betweenness to order
ratio, it performs best for the coefficient of variation. This reinforces the idea behind the model itself: nodes
that do not differ much in their properties (the underlying lattice structure) show a small variation in
betweenness. The preferential attachment and R-MAT models, which generate networks with a fraction of nodes
that have a very high connectivity due to the power law in the node degree distribution, reach a higher
coefficient of variation for betweenness. A graphical comparison of the results for the Large sample of the
Medium Voltage type, considering the average betweenness to order ratio and the coefficient of variation, is
shown in Figure 12.



Comparison of models with average node degree ⟨k⟩ ≈ 6

Table 13 shows the results for the small-world, preferential attachment, Random Graph and R-MAT models with
an average degree ⟨k⟩ ≈ 6. The scores for the metrics considered improve even further with respect to those of
Tables 9 and 11. The characteristic path length, averaged over all the samples with ⟨k⟩ ≈ 6, has reduced to
about 3, that is, 2 hops lower than in the ⟨k⟩ ≈ 4 situation. The same tendency for the clustering coefficient
found for the samples in Table 11 applies to this situation too. The small-world model scores highest since the
neighbors of a node have nine connections with each other, thus substantially contributing to a high coefficient.
For the R-MAT and preferential attachment models the clustering coefficient decreases as the order of the graph
increases, but even for the biggest sample generated (1000 nodes) it is still about one order of magnitude higher
than for a corresponding random graph. It is interesting to highlight how the clustering coefficient for the
small-world model tends to stabilize around the 0.14 value for the biggest samples (250, 500 and 1000 nodes,
respectively). Regarding robustness, on average it increases to a value higher than 0.75. However, it is worth
noticing that the increment mainly involves the preferential attachment and R-MAT models, which improve on
average from 0.44 and 0.61 to 0.70 and 0.73, respectively. Therefore, the additional connectivity is more
beneficial to the power-law models than to the others, which seem to have already hit the upper bound for this
metric in the ⟨k⟩ ≈ 4 situation. The cost of the redundant paths with this enhanced connectivity is reduced even
more, and on average the 10th shortest path is just 1.5 hops longer than the characteristic path length for the
same network. A graphical comparison of the results for the Large sample of the Medium Voltage type,
considering the characteristic path length, clustering coefficient and robustness metrics, is shown in Figure 13.

Increasing the average degree to 6 brings benefits to the betweenness statistics too, cf. Table 14. The average
betweenness to order ratio improves by about 25% compared to the ⟨k⟩ ≈ 4 situation; this ratio is therefore now
very close to the experimental value found for the Internet (i.e., ≈ 2.5), which is one of our desiderata. The
preferential attachment model, in particular, scores lower than the Internet threshold value for all the
categories of samples considered. As already mentioned for the samples with ⟨k⟩ ≈ 4, the coefficient of variation
for betweenness, even in this ⟨k⟩ ≈ 6 situation, scores best for the non-power-law topologies (i.e., small-world
and Random Graph), which show a value below unity for all the sample sizes considered. The improvements in
this metric for the preferential attachment and R-MAT models are present but limited; in fact, in the worst case
they still score higher than 3 and 1.7, respectively. A graphical comparison of the results for the Large sample
of the Medium Voltage type, considering the average betweenness to order ratio and the coefficient of
variation, is shown in Figure 14.

Figure 12: Results for metrics for the Large sample of Medium Voltage network type with average node degree
≈ 4.

Figure 13: Metrics for the Large sample of Medium Voltage network type with average node degree ≈ 6.

Figure 14: Metrics for the Large sample of Medium Voltage network type with average node degree ≈ 6.



Models Independent from the Average Node Degree 

The Copying, Forest Fire, and Kronecker models are not generated using the average node degree explicitly, cf.
Section 6. Therefore, we consider the Power Grid metrics on them separately. We remark, however, that, though
not explicitly used as an input parameter, the average node degree of the generated graphs has values similar to
those of the Random Graph, small-world and preferential attachment models generated with the same order.
Tables 15 and 17 contain the results for the metrics analyzed, while Tables 16 and 18 contain the results for
betweenness.

The Copying Model results are comparable with the values for the metrics analyzed in Table 9, since the
way the model is built provides a constant average node degree ⟨k⟩ ≈ 2. One can see that the Copying Model,
which leads to a power law in the node degree distribution, scores better than the small-world and preferential
attachment models in characteristic path length and robustness. The small cost of the average 10th shortest
path is due to the computation of the worst-case path: under these conditions of network connectivity there are
not ten paths between two nodes, and the very weakly meshed structure that is created admits just one path in
the majority of cases. With the way the model is implemented in the simulation environment used (Stanford
Network Analysis Platform, SNAP), a node just copies a link from another chosen node, and therefore there is
no possibility of generating the "triangle" structures between nodes that are essential for a non-null clustering
coefficient; this explains the score for this metric. Considering the results for betweenness and comparing the
values for the Copying Model with the results obtained for the models generated by imposing ⟨k⟩ ≈ 2, the
Copying Model has a smaller average betweenness, which translates into a betweenness to order ratio that is
better than for the other samples. On the other hand, the coefficient of variation is quite high given the spread
in betweenness, which is extremely high for only a few nodes that sustain the majority of the shortest paths,
while the majority of the nodes participate only in the shortest paths of which they are end nodes. The
statistical mode of the betweenness values for each category of the Copying Model is in fact null.

For the Forest Fire model, we assign different forward and backward burning probabilities to obtain values
for the average degree that are to some extent comparable with the other models. The model with
p_fwd = p_bwd = 0.2 can be compared to models with ⟨k⟩ ≈ 2. The Forest Fire model scores definitely better
than all the others in clustering coefficient. This is not surprising if one recalls the algorithm behind the model:
an ambassador node is chosen and, with a certain probability, a certain number of the ambassador's neighbors
are chosen to establish links to. One can see how many triangle-like structures tend to appear from such a
generating method. The same observations can be made for the Forest Fire model with p_fwd = p_bwd = 0.3
when compared to models with ⟨k⟩ ≈ 4: the characteristic path length scores almost like the other models, while
this model suffers deeply in the robustness metric, which for the biggest samples scores half as much as the
other generating models with ⟨k⟩ ≈ 4. This is due to the very high damage inflicted on network connectivity
when high degree nodes are removed: for the biggest sample (order of about 1000 nodes), when the 20% of
nodes with the highest degree are removed, the biggest connected component is just 2% of the original graph
order. This is typical of heavy-tailed distributions, which Forest Fire models empirically [61]. The metric that
scores best is again the clustering coefficient, which is three times higher (for the biggest sample) than the
already quite high value of the small-world model. Even when we consider denser Forest Fire networks (i.e.,
p_fwd = p_bwd = 0.35), the comparison with the models with ⟨k⟩ ≈ 6 leads to the same conclusions: a far better
clustering coefficient, but an important weakness to node removal. Betweenness for the Forest Fire model
shows a known trend when varying the average node degree: the more connected the network becomes, the
better the betweenness-related metrics become. For the samples with a burning probability of
p_fwd = p_bwd = 0.35, the betweenness to order ratio stays below 3. The same behavior applies to the
coefficient of variation, although it generally scores worse than the samples already analyzed with similar
average degree.
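
The targeted-removal experiment mentioned above (removing the 20% of nodes with the highest degree and measuring the biggest connected component) can be sketched in Python as follows. This is a hedged illustration, not the exact robustness metric of the paper: the function names are hypothetical, degrees are ranked once on the intact graph (whether they are recomputed after each removal is not stated here), and the toy graphs are invented for the example.

```python
def largest_component_order(adj, removed=frozenset()):
    """Order of the largest connected component, ignoring removed nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best


def targeted_removal_fraction(adj, fraction=0.2):
    """Remove the given fraction of highest-degree nodes (degrees taken
    on the intact graph) and return the largest surviving component's
    order divided by the original order."""
    n_remove = int(fraction * len(adj))
    by_degree = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    removed = frozenset(by_degree[:n_remove])
    return largest_component_order(adj, removed) / len(adj)


# Hub-dominated toy graph: a star with 9 leaves around node 0.
star = {0: list(range(1, 10)), **{v: [0] for v in range(1, 10)}}
# Hub-free toy graph: a ring of 10 nodes.
ring = {v: [(v - 1) % 10, (v + 1) % 10] for v in range(10)}
```

On these toy graphs the star collapses into isolated nodes once its hub is removed, whereas the ring keeps most of its nodes connected, mirroring the contrast between heavy-tailed and homogeneous degree distributions.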

The networks generated with the Kronecker model using the parameters extracted from the Power Grid networks
show metric values similar to the ones computed from the physical samples with almost equal order. In
particular, the parameters for the Power Grid create networks with an average node degree, characteristic path
length and robustness that mimic what we found for the current Dutch Medium and Low Voltage samples. Even
the very low clustering coefficient (very often down to zero) is something we already recorded in real Power
Grid samples [71]. When the networks generated with parameters extracted from the Power Grid are compared
with the networks generated from social and technological networks, one sees that the latter show a general
improvement in all the metrics under analysis: a reduction of a couple of hops in the characteristic path length
and a higher clustering coefficient, similar to the values obtained for random graphs. Generally the
social-technological based Kronecker networks score more than 30% better than the corresponding ones based
on Power Grid parameters for characteristic path length. We also see that the networks based on the Kronecker
model show an almost constant, or decreasing, value for the average 10th shortest path and a characteristic
path length that increases very slowly with the order of the network. To some extent this tendency is something
that the Kronecker model aims to achieve: densification of the network over time, i.e., when more nodes become
part of the network the effective diameter of the network becomes smaller. Considering betweenness, one sees a
smaller average betweenness for the networks based on technological and social parameters than for the ones
generated with Power Grid parameters. Despite quite high values for both average betweenness and standard
deviation, the samples produced with Power Grid parameters have a smaller coefficient of variation compared to
the techno-social networks. The comparison of the Kronecker models with the networks generated with the
⟨k⟩ ≈ 2 models shows better values for the former compared to the small-world and preferential attachment
models, while the results are quite similar to those of the Random Graph and R-MAT models.

^http://snap.stanford.edu/

Considering the results of the Random Graph with power-law models, there is a difference between the networks
generated with smaller γ parameters (i.e., Medium and Low Voltage Dutch Grid, γ ≈ 2, and social and
technological networks, γ ≈ 2.3), which score better, and the ones with higher γ (i.e., U.S. Eastern Interconnect
and Western Grid, γ ≈ 3, and U.S. Western Power Grid, γ ≈ 4). The first two sets of samples show a denser
network with a higher average node degree, almost double that of the other two sets; this benefits the metrics
computed, which present a smaller characteristic path length. This set of networks with small γ is comparable,
for the characteristic path length property, to the values obtained for networks generated with ⟨k⟩ ≈ 4. The
second set of samples (i.e., higher γ parameter) shows results that are similar to the ones obtained for samples
generated with ⟨k⟩ ≈ 2. A general property that applies to all these power-law based samples is the problem
they suffer, as already mentioned, from targeted attacks involving the nodes with high degree, which explains
the very poor scores for the robustness metric. The betweenness analysis for the power-law based models shows
an average betweenness value that is smaller for the networks with a lower value of the γ coefficient, so that
they score best in the betweenness to order ratio. On the other hand, a lower γ implies a higher probability of
nodes with a higher node degree being present; usually there is quite a good positive correlation between the
node degree and the betweenness the nodes have to sustain (high degree implies high betweenness for that
node). It is therefore understandable why the coefficient of variation is higher for the networks characterized
by a low γ than for the ones with a higher power-law characteristic parameter.
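
The link between a smaller power-law exponent and a denser network can be illustrated with a short Python sketch that computes the mean of a discrete power-law degree distribution. This is a simplified illustration under stated assumptions, not the generator used for the samples: the minimum degree of 1, the cutoff `k_max = 100` and the function name are all hypothetical choices, and the absolute values depend on them.

```python
def power_law_mean_degree(gamma, k_max=100):
    """Mean of a discrete power law p(k) ~ k**(-gamma) on k = 1..k_max."""
    weights = [k ** (-gamma) for k in range(1, k_max + 1)]
    total = sum(weights)
    return sum(k * w for k, w in zip(range(1, k_max + 1), weights)) / total


# Exponents quoted in the text for the four sets of samples.
means = {g: power_law_mean_degree(g) for g in (2.0, 2.3, 3.0, 4.0)}
```

Under these assumptions the mean degree decreases monotonically as the exponent grows, which is the qualitative trend behind the denser networks obtained for the smaller exponents.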

A graphical comparison of the results for the networks without explicit dependence on the average node degree,
for the Large sample of the Medium Voltage type and considering the characteristic path length, clustering
coefficient and robustness metrics, is shown in Figure 15, while a summary of the results for betweenness for
the same sample is illustrated in Figure 16.



Comparing The Generated Topologies with the Physical Ones 

The analysis of the Northern Netherlands Grid shows an almost constant average degree of about ⟨k⟩ ≈ 2.
Thus, it is fair to compare it with the generated models with a similar average degree, the Copying Model ones
and the Random Graphs with a power law in the node degree distribution based on the data of the Eastern and
Western High Voltage U.S. Power Grid and the U.S. Western High Voltage Power Grid, since all of these
generate networks with average node degree ⟨k⟩ ≈ 2. The generated models, except the one based on Random
Graph with power-law, score better than the physical topologies for all the metrics considered; the
characteristic path length for the R-MAT and Copying Model cases is half that of the real data. The synthetic
networks are also more robust than the real data samples: R-MAT and Random Graph consistently score above
0.3 for the robustness metric, while the real data hardly reach this value. Clustering coefficients are quite
similar, since in this configuration with limited connectivity triangle structures are rare in the network;
however, the R-MAT model almost always has significant clustering coefficient values. An exception is the
small-world model, which almost always scores worse than the real data samples; in fact, with such a limited
average node degree it is not fully correct to consider this synthetic topology a "small-world". The same sort of
considerations apply to the betweenness values: except for the small-world model, all the synthetic models
score better for the average betweenness to order ratio metric, while for the coefficient of variation the
situation is similar. If one considers the satisfaction of the desiderata for the actual samples of the Dutch
Medium and Low Voltage Grid, summarized in Table 19, we notice that all parameters are not satisfied.
However, networks generated according to the models with almost the same average node degree (networks
with ⟨k⟩ ≈ 2 in Table 20 and networks based on Random Graph with power-law based on data from the Eastern
and Western High Voltage U.S. Power Grid and the U.S. Western High Voltage Power Grid in Table 22) do not
satisfy all the desiderata either. This highlights that the first ingredient for the next generation of Grids to
enable local energy exchange is increased connectivity.

Figure 16: Metrics for the Large sample of Medium Voltage network type for models independent from node
degree.



Desiderata                        Northern Netherlands Medium and Low Voltage samples
Modularity                        X
CPL < ln(N)                       X
CC > 5 × CC_RG                    X
Avg. betweenness / order ≈ 2.5    X
c_v < 1                           X
Rob_N > 0.45                      X
APL_10th < 2 × CPL





Table 19: Desiderata parameter compliance of real Medium and Low Voltage network samples of Northern 
Netherlands Grid. 
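
As an illustration of how the quantitative thresholds of Table 19 can be applied to a generated sample, the following Python sketch evaluates four of them on two rows of Table 13. It is a sketch under assumptions: the function name is hypothetical, the inequalities are taken as printed in Table 19, the betweenness-related desiderata are left out, and the reference value CC_RG is taken to be the clustering coefficient of the generated Random Graph of comparable order, which may differ from the reference used in the paper.

```python
import math


def check_desiderata(order, cpl, cc, cc_random, rob, apl_10th):
    """Evaluate four of the Table 19 thresholds for one generated sample."""
    return {
        "CPL < ln(N)": cpl < math.log(order),
        "CC > 5 x CC_RG": cc > 5 * cc_random,
        "Rob_N > 0.45": rob > 0.45,
        "APL_10th < 2 x CPL": apl_10th < 2 * cpl,
    }


# MV-Large small-world sample with average degree ~ 6 (Table 13); the
# reference CC is the MV-Large Random Graph row of the same table.
small_world = check_desiderata(
    order=1000, cpl=4.429, cc=0.14854, cc_random=0.00544,
    rob=0.797, apl_10th=5.905,
)
# MV-Large Random Graph sample with average degree ~ 6 (Table 13),
# compared against its own clustering coefficient.
random_graph = check_desiderata(
    order=998, cpl=4.022, cc=0.00544, cc_random=0.00544,
    rob=0.791, apl_10th=5.662,
)
```

With these inputs the small-world sample passes all four checks, while the Random Graph sample trivially fails the clustering one because it is compared against itself.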



Increasing the average node degree naturally provides better values for the metrics, as shown in Table 20.
The case of the small-world model is emblematic. The ⟨k⟩ ≈ 2 case scores extremely poorly, as there are not
enough "shortcuts" in the network to improve the characteristic path length much. Actually, under such a small
average degree the condition Watts and Strogatz impose for their model is not completely satisfied (i.e.,
n ≫ k ≫ ln(n) ≫ 1, where k is the average node degree and n is the order of the graph). When we move closer to
satisfying the small-world condition by increasing the average node degree, the values of the metrics suddenly
change and the model scores extremely well. The small-world model scores best for the clustering property and
resilience to failures in the ⟨k⟩ ≈ 4 situations. Under these conditions the betweenness values are also quite
concentrated around the mean, with a coefficient of variation that does not exceed unity.

Comparing the average values of the generated models for increasing node degree, one notices a natural
improvement of the metrics, cf. Table 21. In fact, moving from ⟨k⟩ ≈ 2 to ⟨k⟩ ≈ 4 we have a reduction in
characteristic path length of about 60% and an increase in the clustering coefficient of one order of magnitude,
while at the same time the robustness doubles. With ⟨k⟩ ≈ 6 the further improvement in the metrics is less
prominent, being between 10% and 20%. From the comparison of the metric results in Table 20 one sees that the
small-world model almost always satisfies the desiderata requirements from a quantitative point of view when
the average node degree is at least 4. From a qualitative point of view, the small-world model shows to some
extent certain characteristics of modularity, being generated starting from a regular lattice and then rewiring a
certain fraction of the edges.

The models independent from the average node degree generally perform worse than the other models in
satisfying the desiderata values for the Power Grid metrics. The adherence to the target values is shown in
Table 22; one sees the general prevalence of unsatisfied requirements, and in particular the parameters
involving betweenness are never satisfied by these generated samples.

From the topological analysis one can see that, among the models analyzed, the small-world model stands out
when there is a minimal connectivity (⟨k⟩ ≈ 4 or ⟨k⟩ ≈ 6), cf. Table 20. In Table 23 the models with explicit
dependence on node degree are once again compared by assigning a "tick" sign (✓) for the fulfillment of each of
the following properties: qualitative topological parameters (i.e., modularity), quantitative topological
parameters (Table 20) and thrift in network realization (e.g., the addition of cables, which represents a cost).
This last parameter is just a rough estimate; a more detailed analysis of the cost of realizing a Medium or Low
Voltage network of a certain size (i.e., Small, Medium or Large) and of the economic benefits in electricity
distribution arising from the enhanced connectivity is provided in Section 7. From Table 23 we conclude that
the networks generated with the small-world model with average degree ⟨k⟩ ≈ 4 provide the best balance in
satisfying the desiderata of the future Power Grid.



7 Economic Considerations 



Traditionally the problem of evaluating the expansion of an electrical system is a complex task that involves
both the use of modeling, usually based on operations research optimization techniques and linear programming
[42] [59] [10], and the experience and vision of experts in the field supported by computer systems. In this
latter case computers acquire knowledge based on previous experts' decisions and, based on the electrical
physical constraints of the domain, are then able to support Power Grid evolution decisions [55], finding the
most suitable technical and economical solution. With more distributed generating facilities at the local scale,
traditional methods have limits and need to be modified or updated to take into account the new scenario the
Smart Grid framework brings into play. The models that we have so far analyzed as candidates for the vision of
the future Smart Grid also need to be evaluated from the economic point of view. How much will it cost to
generate electrical infrastructures according to these models? What is the actual cost of adding a physical edge
to the topology?

Avg. node degree transition   Average metric improvement (%)
                              CPL     CC      Robustness
⟨k⟩ ≈ 2 → ⟨k⟩ ≈ 4             61.7    941.6   128.5
⟨k⟩ ≈ 4 → ⟨k⟩ ≈ 6             18.0    11.8    19.6



Table 21: Comparison of generated topologies for varying average node degree. 



The Cost of Adding Edges 

One important difference between a physical infrastructure such as the Power Grid and the WWW or social
networks is the physical presence of cables that have to connect the Medium Voltage substations or the Low
Voltage end-user generating units. While establishing a link from one Web page to another is free, each
increase in connectivity in the Power Grid implies costs to adapt the substations or end-user premises involved
and to lay the cables required for the connection. To assess these costs in the Medium and Low Voltage
infrastructure, we consider a simple relation in which the cost of cabling and the cost of substations are
added:

$$C_{imp} = \sum_{j=1}^{N} Ssc_j + \sum_{i=1}^{M} Cc_i \qquad (1)$$

where $C_{imp}$ stands for the cost of implementation, $Ssc_j$ is the adaptation cost for substation $j$ and
$Cc_i$ is the cost of cable $i$. The cost of a cable can be expressed as a linear function of the distance the
cable covers: $Cc_i = C_{uc} \cdot l$, where $C_{uc}$ is the cable cost per unit of length and $l$ is the length
of the cable. Several types of cables are used for power transmission and distribution, with varying physical
characteristics and costs; in addition, the cost of installation can vary significantly [71]. In the present
work, though, we consider only cabling costs and ignore substation costs. While the former are directly tied to
the topology and the length of the links, the latter depend too heavily on other factors. As a source of data
for cable types and pricing, we have been provided (courtesy of Enexis B.V.) with cable characteristics and
prices together with topological information for 11 network samples belonging to the Low Voltage network and 13
samples belonging to the Medium Voltage network of the Northern Netherlands.
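As a minimal sketch of Equation (1), the following Python fragment adds substation adaptation costs and length-proportional cable costs. The class and function names and all numeric values are hypothetical and serve only as illustration; they are not taken from the Enexis data. Passing an empty list of substation costs reproduces the cabling-only simplification adopted in this work.

```python
from dataclasses import dataclass


@dataclass
class Cable:
    unit_cost: float  # C_uc: cost per meter (hypothetical value)
    length: float     # l: cable length in meters


def cable_cost(cable):
    """Cost of one cable as a linear function of its length: Cc_i = C_uc * l."""
    return cable.unit_cost * cable.length


def implementation_cost(substation_costs, cables):
    """Equation (1): sum of substation adaptation costs plus sum of cable costs."""
    return sum(substation_costs) + sum(cable_cost(c) for c in cables)


# Two illustrative cables; substation costs are left out, as in the present study.
cables = [Cable(unit_cost=20.0, length=50.0), Cable(unit_cost=35.0, length=120.0)]
cabling_only = implementation_cost([], cables)
```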



Statistical consideration over cables' price 

Extracting the probability distributions of physical and price data from the Northern Netherlands samples
reveals interesting correlations. The length of a cable plays an important role in both its total resistance and
its price, and the correlation between price and resistance is high. Spearman's rank correlation coefficient
[54], shown in Table 24, evaluates to what extent the joint variation of two variables can be described by a
monotonic relationship; in other words, it gives an indication of the correlation between price and resistance.
This matters especially for generating synthetic networks, where both cable properties should take values
similar to the ones actually used in practice. Plotting the two variables for each cable, one notices that the
majority of the samples concentrate in the lower tails of the joint distribution. Figures 17 and 18 show the
relation between price and resistance, with the values concentrated in the lower corner of the
price × resistance space.
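For readers unfamiliar with the coefficient, the sketch below computes Spearman's rank correlation as the Pearson correlation of the ranks, with ties receiving the average of their positions. It is a plain illustration of the standard definition, not the tooling used to produce Table 24, and the function names are our own.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))
```

Any strictly increasing relation between price and resistance, linear or not, yields a coefficient of 1, which is why the measure suits a check for monotonic dependence.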



In the chart in Figure 18 one notices two distinct lines that depart from the lower-left corner. They
represent the two main types of cables that are used in that sample of the Low Voltage network to cover
different distances; both price and resistance increase when longer lines are realized. This opens a new
perspective: evaluating, for each type of cable used in a certain sample (Small, Medium and Large), how the
lengths of the cables are distributed. In fact, given a certain type of cable and its length, all the other
properties of interest for our analysis are then available (i.e., cable total resistance, cable total cost and
supported current).

A general tendency appears when fitting the distribution of lengths to the cable types belonging to the Low
Voltage and Medium Voltage networks: a fast decay in the probability distribution of lengths, with the majority
of lengths in the order of tens of meters for the Low Voltage cable types and hundreds of meters for the Medium
Voltage ones. Fitting the lengths to a statistical probability distribution shows that the Low Voltage cable
lengths are well approximated by an exponential distribution ($y = f(x; \mu) = \frac{1}{\mu} e^{-x/\mu}$).
Figures 19 and 21 show respectively the cumulative distribution function and the probability density function
for a certain type of cable belonging to the Low Voltage network. The Kolmogorov-Smirnov test [55] lets us
accept the hypothesis in favor of this distribution. The situation is slightly different for the Medium Voltage
cables, where the distribution that generally fits the data best is the generalized extreme value distribution
($y = f(x; k, \mu, \sigma) = \frac{1}{\sigma} \left(1 + k \frac{x-\mu}{\sigma}\right)^{-1-\frac{1}{k}} \exp\left\{-\left(1 + k \frac{x-\mu}{\sigma}\right)^{-\frac{1}{k}}\right\}$);
in this case too the Kolmogorov-Smirnov test supports the hypothesis. The cumulative distribution function and
the probability density function for a cable type belonging to the Medium Voltage network are shown in
Figures 20 and 22 respectively.
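The fitting procedure for the exponential case can be sketched as follows. The data here are synthetic (drawn with an assumed mean of 40 m), since the Enexis samples are not reproduced in this paper, and the function names are our own. Note also that the 5% critical value 1.36/sqrt(n) strictly applies when the distribution is fully specified in advance; with the mean estimated from the same data the check is conservative, so this is only an indication of how such a test works, not a reproduction of the analysis above.

```python
import math
import random


def fit_exponential(lengths):
    """Maximum-likelihood estimate of mu for f(x; mu) = (1/mu) * exp(-x/mu)."""
    return sum(lengths) / len(lengths)


def exponential_cdf(x, mu):
    """Cumulative distribution function of the exponential distribution."""
    return 1.0 - math.exp(-x / mu)


def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF of the sample and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Synthetic cable lengths in meters (assumed mean of 40 m, for illustration only).
rng = random.Random(0)
lengths = [rng.expovariate(1.0 / 40.0) for _ in range(500)]

mu_hat = fit_exponential(lengths)
d_stat = ks_statistic(lengths, lambda x: exponential_cdf(x, mu_hat))
critical_5pct = 1.36 / math.sqrt(len(lengths))  # asymptotic 5% critical value
fit_not_rejected = d_stat < critical_5pct
```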
Figure 17: Price-Resistance pairs joint plot for the Medium Voltage Small size sample.

Figure 18: Price-Resistance pairs joint plot for the Low Voltage Large size sample.

Network Model


Avg. node degree <k> ≈ 2


Avg. node degree <k> ≈ 4


Avg. node degree <k> ≈ 6


Small-world 


// 




// 


Preferential Attachment 


/ 




✓ 


Random Graph 


/ 




✓ 


R-MAT 









Table 23: Summary table considering satisfaction of modularity, performance and cabling cost for generated
models with node degree <k> ≈ 2, 4, 6.



Sample type                  Spearman's rank correlation
                             Price-Resistance
Low Voltage - Small          0.962
Low Voltage - Medium         0.974
Low Voltage - Large          0.937
Medium Voltage - Small       0.787
Medium Voltage - Medium      0.634
Medium Voltage - Large       0.946

Table 24: Spearman's rank correlation for Low Voltage and Medium Voltage representative samples.



Assume that, statistically speaking, the distribution of the lengths for each type of cable in the synthetic
networks is the same as in the real samples. Then, knowing the probability of using a certain type of cable $i$
($P_{cable_i} = \#cable_i / \sum_k \#cable_k$, where $\#cable_i$ is the number of occurrences of cable type $i$
in a certain network sample), which has a certain cost and resistance per meter and a specific supported
current, it is possible to estimate the cables that are used in the synthetic samples together with their
properties.
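The estimation just described can be sketched in Python as below. Cable types are drawn according to their empirical frequency and, following the Low Voltage finding above, lengths are drawn from an exponential distribution with a per-type mean. The cable type labels, mean lengths and unit costs are hypothetical placeholders, not values from the real samples.

```python
import random
from collections import Counter


def cable_type_probabilities(observed_types):
    """P_cable_i = (#cable_i) / (sum over k of #cable_k) from an observed sample."""
    counts = Counter(observed_types)
    total = sum(counts.values())
    return {cable_type: count / total for cable_type, count in counts.items()}


def estimate_cabling_cost(num_edges, type_probs, mean_length, unit_cost, rng):
    """Assign a cable type and a length to every edge and sum the resulting costs."""
    types = list(type_probs)
    weights = [type_probs[t] for t in types]
    total = 0.0
    for _ in range(num_edges):
        cable_type = rng.choices(types, weights=weights, k=1)[0]
        # Exponentially distributed length with the mean observed for this type.
        length = rng.expovariate(1.0 / mean_length[cable_type])
        total += unit_cost[cable_type] * length
    return total


# Hypothetical sample: type "A" occurs three times, type "B" once.
probs = cable_type_probabilities(["A", "A", "B", "A"])
mean_length = {"A": 30.0, "B": 60.0}  # meters (illustrative)
unit_cost = {"A": 15.0, "B": 25.0}    # euro per meter (illustrative)
cost = estimate_cabling_cost(100, probs, mean_length, unit_cost, random.Random(1))
```

Repeating the draw with different seeds gives a spread of cost estimates rather than a single figure, which may be a more honest summary of a synthetic network.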

Economic Benefits of Highly Connected Topologies 

Once the information about cable prices is available, it is possible to estimate the cost of realizing a
network with a certain connectivity and whether such networks are able to lower the (economic) barrier towards
decentralized trading. The results for Low Voltage networks with an average node degree <k> ≈ 2 are shown in
Table 25. The results for <k> ≈ 4 and <k> ≈ 6 are about two and three times more expensive, since the number of
edges increases by the same factors; for completeness the results are summarized in Tables 26 and 27. For
Medium Voltage networks, it is important to clarify that the information available on cable prices in this
study is only partial and limited to some technologies (only a few cross sections of aluminum and copper
cables). Nevertheless, to get a glimpse of the costs for this type of network, we fitted the available prices,
as a function of the cross section, to the best interpolating curve. The relation between price and cross
section for aluminum cables fits a cubic polynomial best, while for the copper ones it is linear; in this way
we can estimate the prices for all the types of cables involved knowing their cross section. The results for
the networks with an average node degree <k> ≈ 2 are shown in Table 28. The results for <k> ≈ 4 and <k> ≈ 6 are
about two and three times more expensive, since the number of edges increases by these same factors; for
completeness, the results are shown in Tables 29 and 30. The small difference in costs between the Medium and
Large types of networks is related mainly to the different technologies (i.e., cable types) that are used for
these types of networks.

Price alone is not enough to describe future scenarios. It is important to investigate how enhanced
connectivity benefits electricity distribution costs. We presented the benefits of more connected networks at
the beginning of this section; however, those results consider only the topology, without any parameter related
to the properties of the cables (e.g., resistance and supported current). In order to consider the effects of
topology on electricity distribution costs, we have developed a set of metrics that associate topological
properties of Power Grid networks with costs in electricity distribution. We applied these metrics in the
analysis of the Medium and Low Voltage Grid of the Northern Netherlands in [74]. In order to apply these metrics
to Power Grid networks, weights are essential, representing physical quantities such as the resistance of the
cable and the maximal



Sample type              Size       Cost (thousand euro)
Low Voltage - Small      ≈ 20       ≈ 30
Low Voltage - Medium     ≈ 90       ≈ 78
Low Voltage - Large      ≈ 200      ≈ 449

Table 25: Cabling cost for <k> ≈ 2 synthetic samples for Low Voltage networks.
Figure 19: Cumulative distribution function for cable length (meters) for cable type "VMvK(h)as 4x150 al" in
Northern Netherlands sample Low Voltage size Large.

Figure 20: Cumulative distribution function for cable length (meters) for cable type "3xlx70al" in Northern
Netherlands Medium Voltage sample size Medium.



Sample type              Size       Cost (thousand euro)
Low Voltage - Small      ≈ 40       ≈ 51
Low Voltage - Medium     ≈ 180      ≈ 174
Low Voltage - Large      ≈ 400      ≈ 827

Table 26: Cabling cost for <k> ≈ 4 synthetic samples for Low Voltage networks.




Figure 21: Probability density function for cable length (meters) for cable type "VMvK(h)as 4x150 al" in 
Northern Netherlands sample Low Voltage size Large. 

Figure 22: Probability density function for cable length (meters) for cable type "3xlx70al" in Northern
Netherlands Medium Voltage sample size Medium.



Sample type              Size       Cost (thousand euro)
Low Voltage - Small      ≈ 60       ≈ 76
Low Voltage - Medium     ≈ 270      ≈ 254
Low Voltage - Large      ≈ 600      ≈ 1239

Table 27: Cabling cost for <k> ≈ 6 synthetic samples for Low Voltage networks.

Sample type                 Size       Cost (millions euro)
Medium Voltage - Small      ≈ 250      ≈ 32
Medium Voltage - Medium     ≈ 500      ≈ 42
Medium Voltage - Large      ≈ 1000     ≈ 43

Table 28: Cabling cost for <k> ≈ 2 synthetic samples for Medium Voltage networks.



Sample type                 Size       Cost (millions euro)
Medium Voltage - Small      ≈ 500      ≈ 55
Medium Voltage - Medium     ≈ 1000     ≈ 88
Medium Voltage - Large      ≈ 2000     ≈ 86

Table 29: Cabling cost for <k> ≈ 4 synthetic samples for Medium Voltage networks.



operating current supported by the cable. Once we have the statistical information about the types and the
lengths of the cables used in a specific type of physical network (i.e., Medium or Low Voltage and its Small,
Medium or Large size), it is possible to assign weights to the edges of the generated graphs. This is done under
the assumption that the same types of cables are used and that the distances covered remain, in general (i.e.,
statistically), the same. In [74] we proposed to consider two types of measures that influence the electricity
price: one related to the dissipation and loss aspects of the Grid, called α, and a second one, called β, which
takes into account the aspects of reliability in the network. Formally, these measures have the following
expressions:



$$\alpha = f(L_{line_N}, L_{substation_N}) \qquad (2)$$

$$\beta = f(Rob_N, Red_N, Cap_N) \qquad (3)$$

In equation (2) the factors influencing α are the losses occurring in the electrical lines ($L_{line_N}$) and
the losses arising at substations ($L_{substation_N}$). The parameters influencing β, on the other hand, are the
robustness of the network to failures ($Rob_N$), the loss experienced in following redundant paths between
nodes ($Red_N$) and the available capacity in the lines connecting the nodes of the network ($Cap_N$). The
relationship between α and β and the price of electricity is considered quadratic, as other components
influencing the electricity price (e.g., fuel price) hold this relationship [48]. Given how the α and β
parameters are computed [74], the smaller the values of each, the smaller the impact the topology has on
electricity prices. For completeness, the essential information about the topological and electricity
cost-related metrics is explained more thoroughly in Appendix B.



Figure 23 shows the values of the α and β metrics for the synthetic networks generated following the
small-world model with an increasing average node degree (<k> ≈ 2, <k> ≈ 4 and <k> ≈ 6). It is not surprising
to see the samples with <k> ≈ 2 score worse than the other networks. The networks with higher average node
degree are better visualized in Figure 24. One sees that the Medium size network scores best and that the
difference between the networks with <k> ≈ 6 and <k> ≈ 4 is limited. Robustness (i.e., the β parameter) for the
Medium and Large size networks already reaches a high value with sufficient connectivity (i.e., <k> ≈ 4), and
more connectivity (i.e., <k> ≈ 6) does not improve this metric much. The Small size samples score better in the
α metric, which is reasonable since the paths, especially in terms of the number of substations traversed along
the shortest path, are limited due to the reduced order of the network.

The α and β metrics for the networks generated for Medium Voltage purposes are shown in Figure 25. The same
tendency appears: once the network is sufficiently connected (i.e., <k> ≈ 4) the metrics score definitely
better than in the <k> ≈ 2 situation. If we look closer at the most connected samples (Figure 26), we see that
the values are quite concentrated, with the exception of the Large sample with <k> ≈ 4. It is interesting to see
the change in the α value once there are more links: the value of the metric almost halves with an increase of
connectivity, i.e., in the <k> ≈ 6 situation.

Let us compare the α and β metrics of the synthetic networks with the values of the real Power Grid samples
of the Northern Netherlands. Considering the Low Voltage samples and the synthetic networks designed for



Sample type                 Size       Cost (millions euro)
Medium Voltage - Small      ≈ 750      ≈ 80
Medium Voltage - Medium     ≈ 1500     ≈ 132
Medium Voltage - Large      ≈ 3000     ≈ 131

Table 30: Cabling cost for <k> ≈ 6 synthetic samples for Medium Voltage networks.

Figure 23: Transport cost of electricity based on the topological properties for synthetic networks based on
the small-world model for the Low Voltage Grid.

Figure 24: Transport cost of electricity based on the topological properties for synthetic networks based on
the small-world model for the Low Voltage Grid (detail of <k> ≈ 4 and <k> ≈ 6 samples).

Figure 25: Transport cost of electricity based on the topological properties for synthetic networks based on
the small-world model for the Medium Voltage Grid.

Figure 26: Transport cost of electricity based on the topological properties for synthetic networks based on
the small-world model for the Medium Voltage Grid (detail of <k> ≈ 4 and <k> ≈ 6 samples).

Electricity cost based on topological properties for Low Voltage samples




this purpose, we generally see an improvement in the metrics, especially in the α values for the <k> ≈ 4 and
<k> ≈ 6 networks. If we set aside the synthetic networks with <k> ≈ 2, because of the problems of the
small-world topology with such small connectivity, the α metric improves on average by more than 50% when
comparing the Northern Netherlands samples with the <k> ≈ 4 synthetic ones: from an average of about 13 for the
physical samples, the <k> ≈ 4 synthetic ones score about 6. The improvement is more than 60% when considering
the <k> ≈ 6 ones, where the average for these synthetic networks is just below 5. There are improvements also in
the β metric, although limited. From an average around 4 for the physical samples, the <k> ≈ 4 networks score
on average just below 2.75, while a better result is obtained by the <k> ≈ 6 networks, which on average score
2.30 (about 40% improvement). The graphical comparison is shown in Figure 27.

Taking into account the Northern Netherlands Medium Voltage samples and the small-world synthetic networks
for this purpose, we see an important improvement in both the α and β values for the <k> ≈ 4 and <k> ≈ 6
networks. As already mentioned, synthetic networks with <k> ≈ 2 should not be considered. The improvement on
average in the α metric is more than 65% for the <k> ≈ 4 synthetic samples (from an average α of about 33 for
the physical samples to about 11 for the <k> ≈ 4 synthetic ones), and more than 75% for the <k> ≈ 6 ones (the
average α for the <k> ≈ 6 synthetic networks is around 7.3). There are improvements also in the β metric. In
particular, from an average around 3.55 for the physical samples, the <k> ≈ 4 networks score on average just
below 1.15; a similar result is obtained by the <k> ≈ 6 networks, which on average score about 1.2 (more than
65% improvement). The graphical comparison is shown in Figure 28.
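The percentages quoted for the Low and Medium Voltage comparisons follow from the relative reduction (physical − synthetic) / physical. The short sketch below recomputes them from the rounded averages given in the text; because those averages are approximate ("about 13", "just below 5"), the recomputed figures are indicative only, and the function and dictionary names are our own.

```python
def improvement(physical, synthetic):
    """Relative reduction of a metric, in percent, from physical to synthetic."""
    return 100.0 * (physical - synthetic) / physical


# Rounded averages quoted in the text: (physical average, synthetic average).
quoted = {
    "LV alpha, <k>~4": (13.0, 6.0),
    "LV alpha, <k>~6": (13.0, 5.0),
    "LV beta,  <k>~4": (4.0, 2.75),
    "LV beta,  <k>~6": (4.0, 2.30),
    "MV alpha, <k>~4": (33.0, 11.0),
    "MV alpha, <k>~6": (33.0, 7.3),
    "MV beta,  <k>~4": (3.55, 1.15),
    "MV beta,  <k>~6": (3.55, 1.2),
}

improvements = {name: improvement(p, s) for name, (p, s) in quoted.items()}
for name, value in improvements.items():
    print(f"{name}: {value:.1f}% improvement")
```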



Discussion 

Watts and Strogatz's small-world model, as shown in Tables 20, 22 and 23, is the model that best captures the
requirements for the new Grid compared to the other ones analyzed, whether these depend on the average node
degree (preferential attachment, R-MAT and Random Graph) or not (Copying Model, Forest Fire, Kronecker and
Power Laws). The tight clustering that this model exhibits provides efficient local distribution with paths
that are locally short; at the same time, the shortcuts between the local clusters keep the average path length
extremely limited. These two aspects influence the α parameter, which then stays relatively low. In addition,
the small-world model benefits from a general robustness against failures: the absence of big hubs that keep
the network together (which are present, on the other hand, in the power-law-based topologies, for instance)
improves the reliability against attacks, which helps in obtaining good scores for the β parameter. More
quantitatively, one sees a general improvement in the metrics characterizing both the losses (i.e., the α
parameter) and the reliability of the Grid (i.e., the β parameter) as the network becomes denser, i.e., as more
edges are added. On average, we see an improvement of at least 50% when comparing the

Figure 28: Comparison for transport cost between synthetic small-world networks (black dots) and Northern
Netherlands Medium Voltage samples (red dots).

physical samples of the Northern Netherlands with the small-world networks with an average degree <k> ≈ 4,
while better results are obtained with more density (i.e., <k> ≈ 6), where the improvement is 60% compared to
the physical samples. This is indeed beneficial to the Power Grid and, according to the relationship with the
topology, it should translate into a reduction in the costs for electricity distribution.

These benefits come literally at a cost. The network needs more connectivity, therefore the costs for extra
cabling need to be considered in addition to the cost of upgrading the substations and the end-users'
electricity gateways. A return on investment analysis of this aspect is beyond the scope of the present study.
Nevertheless, the α and β metrics make it possible to consider how a certain physical sample belonging to a
certain size category (Small, Medium and Large) would improve in its performance if its topology were arranged
according to the principles of a synthetic model and more connections were added accordingly.

The benefits reached for α and β should translate into a reduction in the cost of electricity transport and
distribution, since the parameters that influence these metrics are directly connected to cost-related aspects.
However, the significant investment required to add more connectivity to the network might not immediately
enable cheaper electricity; on the contrary, it might make it more expensive.

8 Related Work 

Complex Network Analysis works usually take into account the Power Grid at the High Voltage level, analyzing
the structure of the network without considering in detail the physical properties of the power lines. In our
previous work [73], we have analyzed several works that investigate Power Grid properties using the Complex
Network Analysis approach. There are two main categories: 1) works that aim to understand the intrinsic
properties of Grid topologies and compare them to other types of networks, assessing the existence of
properties such as small-world or scale-free [5] [931 IMI 18]; 2) works that aim to better understand the
behavior of the network when failures occur (i.e., edge or node removal) and analyze the topological causes
that lead to black-out spread and cascading failures of power lines [84] [2] [31]. Few studies in the Complex
Network Analysis landscape consider the possibility of using the insight gained through the analysis to help
the design. These few cases consider the addition of lines to the network to assess the increase in the
reliability of the entire Power Grid. Examples are the study of the Italian High Voltage Grid [35] and the study
of improvement by line addition in the Italian, French and Spanish Grids [81]. Holmgren [50] also uses Complex
Network Analysis to understand which Grid improvement strategies are most beneficial, showing the different
improvements of typical Complex Network Analysis metrics (e.g., path length, average degree, clustering
coefficient, network connectivity) when different edges and nodes are added to the network, although in a very
simple small graph (less than 10 nodes). Wider in scope is the work of Mei et al. [55], where a self-evolution
process of the High Voltage Grid is studied with Complex Network Analysis methodologies. The model for Power
Grid expansion considers an evolution of the network where power plants and substations are connected in a
"local-world" topology through new transmission lines; overall, the Power Grid reaches the small-world topology
after a few steps of the expansion process. Wang et al. [91] [92] apply Complex Network Analysis techniques to
the Power Grid to understand the kind of communication system needed to support the decentralized control
required by the new Power Grid. The analyses aim at generating samples using random topologies based on uniform
and Poisson probability distributions and a random topology with small-world network features. The simulation
results are compared to real samples of the U.S. Power Grid and to synthetic reference models belonging to the
IEEE literature. These works also investigate the property of the physical impedance to assign to the generated
Grid samples. Complex Network Analysis is not generally used as a design tool to propose new topologies for the
future Smart Grid, as we do in this paper, where we also assess the benefits in terms of economic improvement.

Traditionally, power system engineers adopt techniques that are different from Complex Network Analysis,
although sometimes exploiting graph theory principles [90] [29]. These traditional techniques involve the
identification of an objective function representing the cost of the power flow along a certain line, which is
then subject to physical and energy balance constraints. The problem thus translates into an operations
research problem. These models have long been applied both to High Voltage planning [42] [52] and to Medium and
Low Voltage planning [HOI US]. Besides operations research, expert systems have also been developed to help in
the process of designing grounding stations based on physical requirements as well as heuristic approaches from
engineering experience. The substation grounding issue is approached as an optimization problem of construction
and conductor costs subject to the constraints of technical and safety parameters; its solution is investigated
through a random walk search algorithm [45]. In [40] a pragmatic approach using sensitivity analysis is applied
to a linear model of load flow related to various overloading situations, and a contingency analysis (N-1 and
N-2 redundancy conditions) is performed with different grades of uncertainty in medium and long term scenarios.
In practice the planning and expansion problem is even more complex, since it involves power plants,
transmission lines, substations and the Distribution Grid. In [46] all these aspects are assessed separately and
several challenges appear. For instance, in the planning of a High Voltage overhead transmission line specific
clearance codes must be followed, and not only is load a key element, but topography and weather and climatic
conditions (above all wind and ice) also play an important role in the planning of the infrastructure. For
substation planning the authors of [46] emphasize, in addition to the need for upgrading the Grid (e.g., load
growth, system stability) and budgeting aspects, the multidisciplinary aspects, which range from environmental
and civil to electrical and communications engineering. A more general approach proposed in [46] regards power
system planning as a multi-objective (e.g., economics, environment, feasibility, safety) decision problem, thus
requiring the tools typical of decision analysis [55].

The works mentioned so far mainly take into account the High Voltage end of the Grid, while the Distribution
Grid is no less important, especially in the vision of the future electrical system proposed in this work, where
the end-user plays a vital role. The integrated planning of Medium and Low Voltage networks is tackled by Paiva
et al. [76], who emphasize the need to consider the two networks together to obtain a sensible optimal planning.
The problem is modeled as a mixed integer-linear programming one, with an objective function for investment,
maintenance, operation and losses costs that needs to be minimized while satisfying the constraints of energy
balance and equipment physical limits.

Even more challenges to electrical system planning are posed by the change in the energy landscape, with
several companies running different aspects of the business (generation, transmission, distribution). In
addition, to accommodate more players in the wholesale market, transmission expansion should follow (as is
already the case for generation) a market-based approach, i.e., the demand forces of the market and its forecast
should trigger the expansion of the Grid [15]. The same consideration regarding the need for a different
approach to planning in a deregulated market is expressed in [83], where an optimization of an objective
function in the market environment is applied. Another method to evaluate transmission expansion plans takes
into account the probabilistic reliability criterion of Loss Of Load Expectation (LOLE); in particular, in [23]
an objective function is proposed that takes into account the cost of constructing a transmission line between
all buses involved in the line, subject to constraints on peak load demand satisfaction and on a certain level
of LOLE that the line should not exceed.

In the Smart Grid framework the planning techniques might be revised, especially for the Distribution Grid,
which is the segment that is likely to face the greatest changes due to the presence of Advanced Metering
Infrastructure (i.e., bidirectional intelligent digital meters at the customer location) and Distribution
Automation (i.e., feeders can be monitored and controlled in an automated way through two-way communication).
In addition, the Medium and Low Voltage Grid is no longer a layer where energy is only consumed: Distributed
Energy Generation facilities (small-scale photovoltaic systems and small wind turbines) will be attached to
this segment of the Grid; altogether these elements are likely to reshape the way planning for the Medium and
Low Voltage Grid is realized [IT].

9 Conclusions



In an evolving electricity sector with end-users able to produce their own energy and sell it on a local-scale
market, the Grid plays the essential enabling role of supporting infrastructure. Local-scale energy exchange is
beneficial in several respects, such as the increase in renewable-based energy production, the possibility for
the end-user to obtain an economic benefit by selling surplus energy and, no less important, a step forward in
the unbundling of the electricity sector. We studied how different topologies inspired by technological and
social network studies have varying properties and may or may not be adequate for the future Smart Grid
networks. We showed that among the various models analyzed, the small-world model appears to have many
supporting characteristics, according to a set of topological metrics defined for Power Grids. We also showed
how these topological benefits can be related to economic aspects of electricity distribution through an
improvement in the α and β parameters. In addition, we performed a statistical investigation of the properties
of the cables used in Medium and Low Voltage samples to evaluate the cost of the cables needed to realize the
synthetic networks and thus estimate the investment required for such networks. The benefits reached through
topological properties are significant and help to enable a local energy exchange; however, the quantification
from an economic point of view is not easy due to the high investment required to realize a more connected
Medium and Low Voltage Grid.

The underlying motivation for the present work is to develop decision support techniques based on Complex
Network Analysis metrics to upgrade the Power Grid to a Smart Grid and to assess the current infrastructures.
In addition, the approach makes it possible to predict how a change in the topology, according to a certain
network model, can be beneficial for the network from an efficiency, resilience and robustness perspective.
Finally, it makes it possible to quantify how the topology can help in reducing the parameters influencing
electricity costs while considering the evolution of the Medium and Low Voltage Power Grid network into an
infrastructure to support the Smart Grid.






Appendix A Price and Resistance Distribution 



In Section 7 we illustrated that the physical properties of cables and their prices are correlated. A proper
analysis must then follow a bivariate approach. However, one might be interested in studying only one
characteristic of cables (e.g., price) separately from the others (e.g., resistance). Here we investigate the
statistical distribution of these separate properties of cables. In particular, we look for the presence of
power-laws, since they appear in several natural phenomena and man-made infrastructures [24].

Fitting the price data from the three different sample sizes of the Dutch Low Voltage Grid to the most likely
distribution usually gives distributions that are very concentrated towards small price values; only very few
cables have very high prices (i.e., more than 10000 euros), either because they are particularly long or because
their technology is extremely expensive. The distribution that best fits the data for the Low Voltage samples is
the Log-Normal distribution
($y = f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}$). Comparing the
fitted distribution with the original empirical cumulative distribution function of the data provides
significant p-values with the Kolmogorov-Smirnov test. The analytical parameters for the fitted distribution
obtained through log-likelihood estimation are shown in Table 31.



                        Log-Normal distribution parameters
Sample type             μ          σ
Low Voltage - Small     6.104      1.513
Low Voltage - Medium    6.075      1.397
Low Voltage - Large     6.939      1.258

Table 31: Log-Normal distribution μ and σ parameters for cable price distribution for Low Voltage samples.
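The fitting step described above can be illustrated with a minimal sketch. It assumes the closed-form maximum-likelihood estimates of the log-normal parameters (mean and standard deviation of the log-prices) and computes only the Kolmogorov-Smirnov distance, not its p-value. The function names and the price list are hypothetical and are not taken from the Dutch samples.

```python
import math


def fit_lognormal(samples):
    """Maximum-likelihood estimates (mu, sigma) of a log-normal distribution."""
    logs = [math.log(x) for x in samples]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return mu, sigma


def lognormal_cdf(x, mu, sigma):
    """Cumulative distribution function of the log-normal distribution."""
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF of the samples and a model CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d


# Hypothetical cable prices in euros (illustration only, not the paper's data).
prices = [120.0, 250.0, 310.0, 420.0, 480.0, 650.0, 900.0, 1500.0, 4200.0, 12000.0]
mu, sigma = fit_lognormal(prices)
d = ks_statistic(prices, lambda x: lognormal_cdf(x, mu, sigma))
print(f"mu={mu:.3f} sigma={sigma:.3f} KS distance={d:.3f}")
```

A small KS distance indicates that the fitted curve follows the empirical distribution closely; turning the distance into a p-value requires the Kolmogorov distribution, which is left out of this sketch.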

Information related to prices for Medium Voltage cables is only partially available for this study and limited
to some technologies and cross-sections of aluminum and copper cables. In order to have an estimate of costs,
we fit the available prices to the best interpolating curve. For the aluminum cables we used a cubic polynomial,
while for the copper ones we used a linear relation between price and cross-section. We performed the same
probability distribution fitting procedure with the Medium Voltage samples. Also in this case, the distributions
that best approximate the sample data show a "fat-tail" behavior. For the three representative classes of samples,
we consider that the best approximation is given by the generalized extreme value distribution
($y = f(x; k, \mu, \sigma) = \frac{1}{\sigma}\left(1 + k\frac{x-\mu}{\sigma}\right)^{-1-\frac{1}{k}} \exp\left\{-\left(1 + k\frac{x-\mu}{\sigma}\right)^{-\frac{1}{k}}\right\}$).
For the Small and Medium classes of samples, the Kolmogorov-Smirnov test gives significant p-values, indicating
that the hypothesis of this underlying probability law can be accepted. The Large class sample poses more problems,
since the p-value resulting from the test is under the 5% acceptance threshold. Although the test suggests
rejecting the hypothesis of the underlying distribution, we still consider it a good approximation, since this
type of distribution is the one whose p-value is closest to significance among the distributions tested (e.g.,
log-normal, exponential, Gaussian). The analytical parameters of the fitted distribution, obtained through
log-likelihood estimation, are shown in Table 32.
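To make the role of the three parameters concrete, the following minimal sketch evaluates the generalized extreme value density and cumulative distribution in the parameterization written above (shape k, location μ, scale σ, with k ≠ 0). The function names and the numeric values are hypothetical and chosen only for illustration; they are not the fitted values of this study.

```python
import math


def gev_pdf(x, k, mu, sigma):
    """Density of the generalized extreme value distribution for shape k != 0."""
    t = 1.0 + k * (x - mu) / sigma
    if t <= 0.0:
        return 0.0  # x lies outside the support of the distribution
    return (1.0 / sigma) * t ** (-1.0 - 1.0 / k) * math.exp(-t ** (-1.0 / k))


def gev_cdf(x, k, mu, sigma):
    """Cumulative distribution function for shape k != 0."""
    t = 1.0 + k * (x - mu) / sigma
    if t <= 0.0:
        # Below the lower bound when k > 0, above the upper bound when k < 0.
        return 0.0 if k > 0 else 1.0
    return math.exp(-t ** (-1.0 / k))


# Hypothetical parameters (illustration only).
k, mu, sigma = 0.5, 30000.0, 30000.0
for price in (10000.0, 50000.0, 100000.0, 500000.0):
    tail = 1.0 - gev_cdf(price, k, mu, sigma)
    print(f"P(price > {price:>9.0f}) = {tail:.4f}")
```

With a positive shape parameter the survival probability decays polynomially rather than exponentially, which is the "fat-tail" behavior mentioned in the text. Some libraries use the opposite sign convention for the shape parameter, so the sign of k should be checked before reusing fitted values elsewhere.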

Similar statistical considerations can be applied to fitting the resistance characterizing the cables. The
distributions obtained for both the Low Voltage and the Medium Voltage reference networks once again present a
"fat-tail" characteristic since, although most of the cables have a small resistance, there are some cables
with far higher resistance. The distributions that best fit the data are either generalized extreme value
($y = f(x; k, \mu, \sigma) = \frac{1}{\sigma}\left(1 + k\frac{x-\mu}{\sigma}\right)^{-1-\frac{1}{k}} \exp\left\{-\left(1 + k\frac{x-\mu}{\sigma}\right)^{-\frac{1}{k}}\right\}$)
or log-normal ($y = f(x; \mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}} e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}$).
The parameters are shown in Tables 33 and 34 for Low Voltage and Medium Voltage respectively.

Both price and resistance distributions usually have most of their probability concentrated in the lower
values; however, the tail of the distribution contains contributions that are small in probability but very
large in value. We also investigate whether this "heavy-tail" contribution has power-law
properties. We apply the fitting techniques proposed by Clauset et al. [24] to detect the presence of
significant power-law contributions in these distributions. From this analysis it appears that there are signs of



                           Extreme value distribution parameters
Sample type                k         σ           μ
Medium Voltage - Small     0.547     33082.4     31988.8
Medium Voltage - Medium    0.419     32569.4     35880.8
Medium Voltage - Large     0.490     16925.2     16766.9

Table 32: Extreme values distribution k, μ and σ parameters to fit cable price distribution for Medium Voltage
samples.
                                                      Distribution parameters
Sample type             Distribution type             k           μ            σ
Low Voltage - Small     Log-normal                    n/a         -2.27846     1.97188
Low Voltage - Medium    Generalized extreme values    0.994657    0.054877     0.058296
Low Voltage - Large     Log-normal                    n/a         -0.881168    1.25617

Table 33: Distribution parameters to fit cable resistance for Low Voltage samples.



                           Extreme value distribution parameters
Sample type                k           σ          μ
Medium Voltage - Small     1.09862     4.1366     2.96819
Medium Voltage - Medium    0.613803    3.35594    3.41663
Medium Voltage - Large     0.619069    3.59693    3.35337

Table 34: Extreme values distribution k, μ and σ parameters to fit cable resistance for Medium Voltage samples.



power-law behavior in both the cable price and the cable resistance distributions. These power-law contributions
are generally significant in the middle part of the distribution, while the very initial part of the distribution
and the final part of the tail tend to deviate from the power-law rule. In fact, for the central part of the
distribution the p-value of the Kolmogorov-Smirnov test is generally higher than the 5% threshold below which
the power-law hypothesis would be rejected. Two examples for the Low Voltage and Medium Voltage samples are given
in Figures 29 and 30 for cable resistance, while Figures 31 and 32 refer to cable prices. Each figure
shows the complementary cumulative probability distribution on a double logarithmic scale, where the blue
circles represent the sample data, the red line is the best fitting probability distribution over the whole sample
(described above) and the black dashed line represents the best fitting power-law distribution in the interval of
the sample closest to a power-law.
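The power-law check on a portion of the distribution can be sketched as follows. This is a minimal illustration of the continuous maximum-likelihood exponent estimate and of the Kolmogorov-Smirnov distance restricted to values above a lower cut-off, in the spirit of the techniques of [24]. The cut-off `x_min` is given as an input here, whereas a complete procedure would also search for it; the function names and the synthetic data are assumptions made for the example.

```python
import math


def fit_power_law_tail(samples, x_min):
    """Continuous maximum-likelihood exponent for the samples with x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    return 1.0 + len(tail) / log_sum, len(tail)


def power_law_ks(samples, x_min, exponent):
    """KS distance between the tail's empirical CDF and the fitted power law."""
    tail = sorted(x for x in samples if x >= x_min)
    n = len(tail)
    d = 0.0
    for i, x in enumerate(tail):
        model = 1.0 - (x / x_min) ** (1.0 - exponent)
        d = max(d, abs(model - i / n), abs((i + 1) / n - model))
    return d


# Synthetic data (illustration only): evenly spaced quantiles of a power law
# with exponent 2.5 and lower cut-off 1.0, obtained by inverting its CDF.
n = 1000
true_exponent = 2.5
data = [(1.0 - (i + 0.5) / n) ** (-1.0 / (true_exponent - 1.0)) for i in range(n)]

exponent, tail_size = fit_power_law_tail(data, x_min=1.0)
distance = power_law_ks(data, 1.0, exponent)
print(f"estimated exponent={exponent:.3f} on {tail_size} points, KS distance={distance:.4f}")
```

Restricting the fit to an interval, as done for the black dashed lines in the figures, amounts to choosing the lower cut-off (and discarding the far end of the tail) so that the KS distance stays small.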



Appendix B Relating Topological Properties to Economical Distribution Benefits

In Section [7] we introduce the idea of associating Grid topology with the cost of electricity. Here we give a
thorough explanation of these concepts, based on the findings of our previous work [74], where we developed a set
of metrics relating topological aspects to electricity cost and applied it to the existing Dutch Medium and Low
Voltage infrastructure. As described in Section [7], we take advantage of that proposal and apply the same metrics
to the generated topologies suitable for the Smart Grid. The goal is to consider, from a topological perspective,
those measures that are critical in contributing to the cost of electricity as elements of the Transmission and
Distribution Networks, as described in economic studies such as those of Harris and Munasinghe [48, 70]:

• losses both in line and at transformer stations, 

• security and capacity factors, 

• line redundancy, and 

• power transfer limits. 

The topological aspects that we consider provide two sorts of measures. The first one, α, gives an average of
the dissipation in the transmission between two nodes:

$$\alpha = f(L_{line_N}, L_{substation_N}) \tag{4}$$

The second one, β, is a measure of reliability/redundancy in the paths between any two nodes:

$$\beta = f(Rob_N, Red_N, Cap_N) \tag{5}$$

The functions to explicitly compute the α and β parameters can be expressed as follows:

• Losses on the transmission/distribution line can be expressed by the quotient of the weighted characteristic 
path length and the average weight of a line (a weighted edge in the graph): 

$$L_{line_N} = \frac{WCPL_N}{\overline{w}} \tag{6}$$



Figure 29: Cumulative probability distribution (complementary) for cable resistance M-size sample Low Voltage 
network (double logarithmic scale). 




Figure 30: Cumulative probability distribution (complementary) for cable resistance M-size sample Medium 
Voltage network (double logarithmic scale). 
Figure 31: Cumulative probability distribution (complementary) for cable price M-size sample Low Voltage
network (double logarithmic scale). 




Figure 32: Cumulative probability distribution (complementary) for cable price M-size sample Medium Voltage 
network (double logarithmic scale). 
• Losses at substation level are expressed as the number of nodes (on average) that are traversed when
computing the weighted shortest path between all the nodes in the network:

$$L_{substation_N} = Nodes_{WCPL_N} \tag{7}$$



• Robustness is evaluated with a random removal strategy and a weighted-node-degree-based removal strategy, by
computing the average of the order of the maximal connected component between the two situations when
20% of the nodes of the original graph are removed. It can be written as:

$$Rob_N = \frac{|MCC_{Random20\%}| + |MCC_{NodeDegree20\%}|}{2} \tag{8}$$

• Redundancy is evaluated by covering a random sample of the nodes in the network (40% of the nodes,
half of which represent source nodes and the other half destination nodes) and computing, for
each source and destination pair, the first ten shortest paths of increasing length. If fewer than ten
paths are available, the worst case path between the two nodes is considered. To measure how much these
resilient paths increase the transportation cost, a normalization with the weighted characteristic
path length is performed. We formalized it as:

$$Red_N = \frac{\sum_{i \in Sources,\, j \in Sinks} SP_{w_{ij}}}{WCPL} \tag{9}$$

• Network capacity is considered as the value of the weighted characteristic path length, whose weights are 
the maximal operating current supported, normalized by the average weight of the edges in the network 
(average current supported by a line). That is: 

$$Cap_N = \frac{WCPL_{current_N}}{\overline{w}_{current}} \tag{10}$$
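The two loss measures of equations (6) and (7) can be illustrated on a toy graph. The sketch below makes several assumptions that are not stated in the text: the weighted characteristic path length is taken as the mean weighted shortest-path length over all connected node pairs, the number of traversed nodes includes both endpoints of a path, and the graph is stored as a symmetric dictionary of dictionaries. The function names and the four-node example are hypothetical.

```python
import heapq
from itertools import combinations


def shortest_path(graph, source, target):
    """Dijkstra: return (weighted length, list of nodes) or None if unreachable."""
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    done = set()
    while queue:
        d, u = heapq.heappop(queue)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(queue, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


def line_and_substation_losses(graph):
    """Return (L_line, L_substation) for an undirected weighted graph."""
    lengths, node_counts = [], []
    for s, t in combinations(sorted(graph), 2):
        result = shortest_path(graph, s, t)
        if result is None:
            continue  # disconnected pairs are skipped (assumption)
        length, path = result
        lengths.append(length)
        node_counts.append(len(path))  # endpoints included (assumption)
    wcpl = sum(lengths) / len(lengths)
    # Each undirected edge is stored twice; count it once.
    weights = [w for u in graph for v, w in graph[u].items() if u < v]
    mean_weight = sum(weights) / len(weights)
    return wcpl / mean_weight, sum(node_counts) / len(node_counts)


# Hypothetical ring of four nodes; weights play the role of cable resistance.
grid = {
    "A": {"B": 1.0, "D": 2.5},
    "B": {"A": 1.0, "C": 2.0},
    "C": {"B": 2.0, "D": 3.0},
    "D": {"C": 3.0, "A": 2.5},
}
l_line, l_substation = line_and_substation_losses(grid)
print(f"L_line={l_line:.4f} L_substation={l_substation:.4f}")
```

Under the same assumptions, the first returned value corresponds to the capacity measure of equation (10) when the edge weights are the maximal supported currents instead of the resistances.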



With these instantiations, equations (4) and (5) become:

$$\alpha = f(L_{line_N}, L_{substation_N}) = L_{line_N} + L_{substation_N} \tag{11}$$

$$\beta = f(Red_N, Rob_N, Cap_N) = \ldots \tag{12}$$

The aspects considered here are just some of the factors (the ones closely coupled to topology) that influence
the overall price of electricity. Naturally, there are other factors that influence the final price, e.g., fuel
prices, government policies and taxation, as illustrated for instance in the economic studies of Harris and
Munasinghe [48, 70].
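The robustness measure of equation (8) can be sketched in the same spirit. The example below assumes that the nodes to remove by weighted degree are ranked once on the original graph, that the random removal is a single seeded draw, and that the result is reported as a number of nodes rather than as a fraction of the original order. The function names and the star-shaped example are hypothetical.

```python
import random


def largest_component_order(graph, removed):
    """Order (node count) of the largest connected component after removals."""
    seen = set(removed)
    best = 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def robustness(graph, fraction=0.2, seed=0):
    """Average largest-component order after random and degree-based removal."""
    nodes = sorted(graph)
    k = int(round(fraction * len(nodes)))
    rng = random.Random(seed)
    random_removed = rng.sample(nodes, k)
    # Weighted node degree: sum of the weights of the incident edges.
    by_degree = sorted(nodes, key=lambda u: sum(graph[u].values()), reverse=True)
    degree_removed = by_degree[:k]
    mcc_random = largest_component_order(graph, random_removed)
    mcc_degree = largest_component_order(graph, degree_removed)
    return (mcc_random + mcc_degree) / 2.0


# Hypothetical star network: one hub feeding nine leaves.
star = {"hub": {}}
for i in range(9):
    leaf = f"leaf{i}"
    star["hub"][leaf] = 1.0
    star[leaf] = {"hub": 1.0}

print("Rob (star, 20% removed):", robustness(star))
```

A star-like topology illustrates why the degree-based removal matters: losing the hub disconnects every leaf, whereas a more meshed topology keeps a large connected component under both removal strategies.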

Appendix C The Grid Engineering Process based on Complex Network Analysis

In our previous analysis work [74] we considered a topological analysis of the Dutch Medium and Low Voltage
Power Grid, while in this work we generate synthetic networks to assess which ones are better suited to support a
Smart Grid where prosumers exchange energy at the local scale. Based on both these studies, we can define an
engineering process to upgrade the existing infrastructures towards a Smart Grid of prosumers. The engineering
process is thus based on Complex Network Analysis metrics and techniques.

This process is intended for energy distribution companies to assess the current state of their infrastructures,
considering the influence of the topology on electricity transport prices. In a totally unbundled
scenario for the electricity market, the distribution company might be incentivized to provide a better
infrastructure, closer to prosumer and consumer needs. The distribution company might charge them based on
indicators that take into account not only downtime periods, but also topological efficiency, which is based on
the influence of topology on electricity prices. Figure 33 presents the flow of this process. Each big rectangular
box represents a phase of this process and contains a number of operations, arranged in a flow and represented by
small rectangles. The initial phase is basically an analysis of the existing infrastructure and a computation of
its topological properties (Phase 1, light orange box), extracted from the Grid sample data given as input
(trapezoidal block in the figure). Economic factors (Phase 2, violet box) play a role too: they are accounted for
by considering the desired costs of electricity distribution and the investment available to the electricity
distribution company (trapezoidal blocks in the Phase 2 box). The actual infrastructure is then matched to the
desired one (Phase 3, light yellow box), and reports on the cost benefits and on the cost of the investment are
provided (trapezoidal blocks in the Phase 3 box). Each phase has more articulated internal processes, which are
detailed below.
Figure 33: Engineering process for Medium and Low Voltage Grid optimized for prosumer-based energy exchange.

The first phase of the process starts with the acquisition of the (complete) network topological information,
that is, information about the nodes of the Grid (substations, transformers, end-users) and the lines connecting
those nodes (cables and links). Physical parameters characterizing the cables are also necessary, such as resistance
per unit of length, length of cables and capacity of cables (supported current). Once this information is available,
it is possible to build a Power Grid graph. The following step is to perform a Complex Network Analysis, extracting
the metrics that are essential to assess the influence of topology on electricity prices, as shown in Appendix B
(e.g., weighted characteristic path length, average number of nodes involved in the shortest path). From these
metrics the summary indicators that relate electricity prices to topology are computed, namely the α and β
values, as shown in Section [7] and Appendix B.

The second phase represents the input of the requirements for the evolution of the Grid. These translate
into constraints for the generation of a network topology which satisfies the parameters desired by the electricity
distribution provider, or by any other actor (e.g., the municipality, a cooperative of users in the neighborhood, a
venture capitalist) interested in realizing a Distribution Network that is better suited to the small-scale energy
exchange paradigm. The stakeholder in the network defines a cost for the electricity distribution (most likely a
range for the cost). This target cost is translated into topological measures (α and β parameters) that the target
network should satisfy. In addition to the constraint regarding the α and β parameters, the stakeholder provides
additional constraints, such as the number of nodes (or transmission lines) the network without improvements
should have (it could be the same as in the original sample, or different in case of a planned increase/decrease in
the network assets) and the available budget. The budget quantifies the investment in realizing/upgrading the
network to make it better suited to prosumer-based energy exchange (this influences the possibility of increasing
the number of substations and power lines).

The third phase consists of adapting the physical sample network to the synthetic one, once the two sets
of topological measures coming from Phases 1 and 2 have been compared. This phase therefore provides a new
network that is optimized for local-scale energy exchange, considering the constraints given in Phase 2. Once
the network is available, it is possible to compare it with the physical Power Grid sample in order to evaluate
the presumed benefits in terms of topology and its advantages in electricity distribution costs. It is also
possible to evaluate the foreseen cost of the investment needed to achieve this kind of network.
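Since the steps of the process are not defined in detail, the following is only a hypothetical sketch of how the Phase 2 requirements could be checked against candidate networks in Phase 3. It assumes that the target cost has already been translated into acceptable ranges for α and β and that each candidate topology comes with its α, β and estimated cabling cost. All class names, field names and numbers are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    alpha_range: tuple  # (min, max) acceptable alpha, derived from the target cost
    beta_range: tuple   # (min, max) acceptable beta
    budget: float       # available investment, in euros


@dataclass
class Candidate:
    name: str
    alpha: float
    beta: float
    cabling_cost: float  # estimated cost of realizing the topology, in euros


def feasible(candidate, req):
    """True when the candidate meets the alpha, beta and budget constraints."""
    lo_a, hi_a = req.alpha_range
    lo_b, hi_b = req.beta_range
    return (lo_a <= candidate.alpha <= hi_a
            and lo_b <= candidate.beta <= hi_b
            and candidate.cabling_cost <= req.budget)


def select(candidates, req):
    """Return the feasible candidates ordered by increasing cabling cost."""
    return sorted((c for c in candidates if feasible(c, req)),
                  key=lambda c: c.cabling_cost)


# Hypothetical requirements and candidate networks (illustration only).
req = Requirements(alpha_range=(5.0, 10.0), beta_range=(0.2, 0.7), budget=3.0e6)
candidates = [
    Candidate("candidate-A", 8.0, 0.4, 2.0e6),
    Candidate("candidate-B", 9.5, 0.6, 1.5e6),
    Candidate("candidate-C", 12.0, 0.5, 1.0e6),  # alpha outside the range
    Candidate("candidate-D", 7.0, 0.3, 5.0e6),   # exceeds the budget
]
chosen = select(candidates, req)
for c in chosen:
    print(c.name, c.alpha, c.beta, c.cabling_cost)
```

Ordering by cabling cost is one possible design choice; a stakeholder could equally rank the feasible candidates by the improvement in α and β obtained per euro invested.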

We leave the detailed definition of all the steps of the engineering process as future work. Most pieces
of the puzzle are nevertheless already there: Phase 1 is covered by [73], while the current work covers Phase 2
by offering ways to compare synthetic and physical samples in terms of α and β. In addition, an assessment of the
costs required to build the synthetic networks is given as an aspect of evaluation in the current work.